~ubuntu-branches/ubuntu/karmic/calibre/karmic

Viewing changes to src/calibre/web/feeds/recipes/recipe_publico.py

  • Committer: Bazaar Package Importer
  • Author(s): Martin Pitt
  • Date: 2009-07-30 12:49:41 UTC
  • mto: This revision was merged to the branch mainline in revision 13.
  • Revision ID: james.westby@ubuntu.com-20090730124941-kviipg9ypwgppulc
Tags: upstream-0.6.3+dfsg
Import upstream version 0.6.3+dfsg

"""
publico.py - v1.0

Copyright (c) 2009, David Rodrigues - http://sixhat.net
All rights reserved.
"""

__license__ = 'GPL 3'

from calibre.web.feeds.news import BasicNewsRecipe
import re

class Publico(BasicNewsRecipe):
    title          = u'P\xfablico'
    __author__     = 'David Rodrigues'
    oldest_article = 1
    max_articles_per_feed = 30
    encoding       = 'utf-8'
    no_stylesheets = True
    language       = _('Portuguese')
    # Strip U+FFFD replacement characters left behind by bad decoding.
    preprocess_regexps = [(re.compile(u"\uFFFD", re.DOTALL|re.IGNORECASE), lambda match: ''),]

    feeds          = [
                        (u'Geral', u'http://feeds.feedburner.com/PublicoUltimaHora'),
                        (u'Internacional', u'http://www.publico.clix.pt/rss.ashx?idCanal=11'),
                        (u'Pol\xedtica', u'http://www.publico.clix.pt/rss.ashx?idCanal=12'),
                        (u'Ci\xeancias', u'http://www.publico.clix.pt/rss.ashx?idCanal=13'),
                        (u'Desporto', u'http://desporto.publico.pt/rss.ashx'),
                        (u'Economia', u'http://www.publico.clix.pt/rss.ashx?idCanal=57'),
                        (u'Educa\xe7\xe3o', u'http://www.publico.clix.pt/rss.ashx?idCanal=58'),
                        (u'Local', u'http://www.publico.clix.pt/rss.ashx?idCanal=59'),
                        (u'Media e Tecnologia', u'http://www.publico.clix.pt/rss.ashx?idCanal=61'),
                        (u'Sociedade', u'http://www.publico.clix.pt/rss.ashx?idCanal=62')
                    ]
    remove_tags    = [dict(name='script'), dict(id='linhaTitulosHeader')]
    keep_only_tags = [dict(name='div')]

    def print_version(self, url):
        # Reuse the article's "id=<digits>" query parameter on the print page.
        s = re.findall("id=[0-9]+", url)
        if not s:
            # No article id in the URL: fall back to the original page.
            return url
        return "http://ww2.publico.clix.pt/print.aspx?" + s[0]
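
The URL rewriting done by print_version can be tried outside calibre. The following is a minimal standalone sketch, not part of the recipe: the function name to_print_url, the constant PRINT_BASE, and the example article URL are hypothetical, and returning the URL unchanged when it carries no "id=<digits>" parameter is a choice made for this sketch.

```python
import re

# Print-page prefix taken from the recipe's print_version method.
PRINT_BASE = "http://ww2.publico.clix.pt/print.aspx?"


def to_print_url(url):
    """Map an article URL to its print-view URL (hypothetical helper)."""
    # Collect every "id=<digits>" fragment; the first one is the article id.
    ids = re.findall(r"id=[0-9]+", url)
    if not ids:
        # Assumption of this sketch: leave URLs without an id untouched.
        return url
    return PRINT_BASE + ids[0]


# Hypothetical article URL, used only for illustration.
example = "http://www.publico.clix.pt/noticia.aspx?id=1234567"
print(to_print_url(example))
```

Note that the pattern requires the literal text "id=" followed by digits, so a feed URL such as rss.ashx?idCanal=11 does not match and is passed through as-is.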