~ubuntu-branches/debian/experimental/calibre/experimental

Viewing changes to recipes/times_online.recipe

  • Committer: Package Import Robot
  • Author(s): Martin Pitt
  • Date: 2012-10-03 23:18:14 UTC
  • mfrom: (1.3.36)
  • Revision ID: package-import@ubuntu.com-20121003231814-i67zl632zxlj4qn1
Tags: 0.9.0+dfsg-1
* New upstream release.
* debian/control, debian/rules: ttf-liberation is no more, move to
  fonts-liberation. Thanks to Kan-Ru Chen! (Closes: #674838)
* debian/calibre.install: Drop pyPdf, not shipped upstream any more.
* debian/control: Add new python-netifaces dependency.

--- recipes/times_online.recipe
+++ recipes/times_online.recipe
@@ -1,6 +1,6 @@
 
 __license__   = 'GPL v3'
-__copyright__ = '2009-2010, Darko Miletic <darko.miletic at gmail.com>'
+__copyright__ = '2009-2012, Darko Miletic <darko.miletic at gmail.com>'
 '''
 www.thetimes.co.uk
 '''
@@ -21,6 +21,7 @@
     encoding              = 'utf-8'
     delay                 = 1
     needs_subscription    = True
+    auto_cleanup          = False    
     publication_type      = 'newspaper'
     masthead_url          = 'http://www.thetimes.co.uk/tto/public/img/the_times_460.gif'
     INDEX                 = 'http://www.thetimes.co.uk'
@@ -41,13 +42,14 @@
 
     def get_browser(self):
         br = BasicNewsRecipe.get_browser()
-        br.open('http://www.timesplus.co.uk/tto/news/?login=false&url=http://www.thetimes.co.uk/tto/news/?lightbox=false')
+        br.open('http://www.thetimes.co.uk/tto/news/')
         if self.username is not None and self.password is not None:
-            data = urllib.urlencode({ 'userName':self.username
+            data = urllib.urlencode({ 
+                                      'gotoUrl' :self.INDEX
+                                     ,'username':self.username
                                      ,'password':self.password
-                                     ,'keepMeLoggedIn':'false'
                                    })
-            br.open('https://www.timesplus.co.uk/iam/app/authenticate',data)
+            br.open('https://acs.thetimes.co.uk/user/login',data)
         return br
 
     remove_tags      = [
@@ -58,6 +60,7 @@
     keep_only_tags   = [
                           dict(attrs={'class':'heading' })
                          ,dict(attrs={'class':'f-author'})
+                         ,dict(attrs={'class':['media','byline-timestamp']})
                          ,dict(attrs={'id':'bodycopy'})
                        ]
 
@@ -79,11 +82,6 @@
                ,(u'Arts'        , PREFIX + u'arts/?view=list'         )
             ]
 
-    def preprocess_html(self, soup):
-        for item in soup.findAll(style=True):
-            del item['style']
-        return self.adeify_images(soup)
-
     def parse_index(self):
         totalfeeds = []
         lfeeds = self.get_feeds()
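
Note on the login change: the new revision posts the fields gotoUrl, username and password to the login URL, and no longer sends userName or keepMeLoggedIn. The sketch below only illustrates how such a form body is encoded. It is not part of the recipe: the recipe itself uses the Python 2 call urllib.urlencode, while this sketch assumes Python 3 and urllib.parse. The helper name build_login_payload and the example credentials are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode, parse_qs

# Same value as the recipe's INDEX attribute.
INDEX = 'http://www.thetimes.co.uk'


def build_login_payload(username, password, goto_url=INDEX):
    """Return a form-encoded body with the field names used in the new revision.

    Hypothetical helper for illustration only; the recipe builds this
    dictionary inline inside get_browser().
    """
    return urlencode({
        'gotoUrl': goto_url,
        'username': username,
        'password': password,
    })


# Example credentials are placeholders, not real account data.
payload = build_login_payload('reader@example.com', 's3cret')

# Decode the body again to inspect which fields would be sent.
fields = parse_qs(payload)
```

Decoding the payload with parse_qs is a quick way to check that only the three expected fields are present before the body is handed to br.open().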