Latest release
version 0.1 on Apr 12th, 2013

What it does

Newspeak is a feed aggregator with advanced features for keyword filtering and link content extraction, implemented as a standalone Django application.

Architecture

Newspeak performs the following tasks (in order):

  1. Fetches the RSS/Atom feeds specified by the Feed model (in parallel).
  2. Parses the feeds using feedparser.
  3. (Optionally) applies per-feed inclusive/exclusive keyword filters on the title and/or summary, based on the KeywordFilter model.
  4. (Optionally) extracts summary data from the feed entry's link URL with an XPath expression, using lxml.
  5. (Optionally) extracts enclosure information from the feed entry's link URL with XPath expressions, using lxml.
  6. Stores the resulting feed information locally in a database.
  7. Serves the aggregate of all the feed entries in a single RSS/Atom feed.
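
To make step 3 more concrete, here is a minimal sketch of inclusive/exclusive keyword filtering in plain Python. It is not Newspeak's actual code: the KeywordRule class, its field names, the entry_passes function and the rule semantics (every inclusive keyword must be present, no exclusive keyword may be present, matching is case-insensitive) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class KeywordRule:
    """Hypothetical stand-in for a keyword filter; not the real KeywordFilter model."""
    keyword: str
    inclusive: bool = True      # True: keyword must occur; False: keyword must not occur
    match_title: bool = True    # look for the keyword in the entry title
    match_summary: bool = True  # look for the keyword in the entry summary


def entry_passes(entry, rules):
    """Return True if the entry (a dict with 'title' and 'summary') satisfies every rule."""
    for rule in rules:
        parts = []
        if rule.match_title:
            parts.append(entry.get("title", ""))
        if rule.match_summary:
            parts.append(entry.get("summary", ""))
        found = rule.keyword.lower() in " ".join(parts).lower()
        # An inclusive rule fails when the keyword is missing,
        # an exclusive rule fails when the keyword is present.
        if found != rule.inclusive:
            return False
    return True


rules = [KeywordRule("privacy"), KeywordRule("sponsored", inclusive=False)]
entries = [
    {"title": "Privacy law update", "summary": "New rules proposed."},
    {"title": "Sponsored: privacy tools", "summary": ""},
    {"title": "Weather", "summary": "Sunny."},
]
kept = [e["title"] for e in entries if entry_passes(e, rules)]
print(kept)  # ['Privacy law update']
```

Whether multiple filters on one feed combine with AND (as assumed here) or with OR is a design choice; check the KeywordFilter model in the source to see how Newspeak actually handles it.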

The flow of feed data through the application is roughly as follows (given some example feeds and keyword filters):

[Feed 1]-[Keyword filter 1]-[Keyword filter 2]-[XPath content extraction]-----------------------------`\
[Feed 2]--------------------[Keyword filter 3]-[XPath summary extraction]-[XPath content extraction ] -+--[Aggregate output feed]
[Feed 3]-[Keyword filter 3]-[Keyword filter 4]---------------------------------------------------------/
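
The flow in the diagram can be read as a per-feed chain of stages whose results are merged at the end. The sketch below illustrates that idea only; run_pipeline, keyword_stage and aggregate are hypothetical names, entries are plain dicts, and sorting the aggregate by publication date (newest first) is an assumption rather than documented Newspeak behaviour.

```python
from functools import reduce


def run_pipeline(entries, stages):
    """Pass a feed's entries through each stage in order.

    Every stage is a function that takes a list of entries and returns a list.
    """
    return reduce(lambda acc, stage: stage(acc), stages, entries)


def keyword_stage(keyword):
    """Build a stage that keeps only entries whose title contains the keyword."""
    def stage(entries):
        return [e for e in entries if keyword.lower() in e["title"].lower()]
    return stage


def aggregate(per_feed_results):
    """Merge the per-feed results into one list, newest entry first."""
    merged = [e for entries in per_feed_results for e in entries]
    return sorted(merged, key=lambda e: e["published"], reverse=True)


feed_1 = [
    {"title": "Data retention ruling", "published": "2013-04-10"},
    {"title": "Football results", "published": "2013-04-11"},
]
feed_2 = [{"title": "New data breach", "published": "2013-04-12"}]

output = aggregate([
    run_pipeline(feed_1, [keyword_stage("data")]),  # one filter on feed 1
    run_pipeline(feed_2, []),                       # feed 2 passes through unchanged
])
print([e["title"] for e in output])  # ['New data breach', 'Data retention ruling']
```

An XPath extraction step would fit the same shape: a stage that takes the entries and returns them with an enriched summary or enclosure.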

Installing

Getting started with Newspeak is easy thanks to logan, David Cramer's toolkit for making standalone Django apps. Simply perform the following steps:

  1. Install Newspeak in editable mode, so that you can easily code along:

    pip install -e \
      git+https://github.com/bitsoffreedom/newspeak.git#egg=newspeak
    

    If you like to keep your Python environment clean, do this in a virtualenv.

  2. Initialize configuration in ~/.newspeak/newspeak.conf.py:

    newspeak init
    
  3. Perform (optional) configuration by editing the settings file. Because Newspeak is based on Django, all available Django settings can be used. Furthermore, there are some Newspeak-specific settings:

    • NEWSPEAK_THREADS: The number of (lightweight) threads used for crawling feed data.
    • NEWSPEAK_METADATA: Metadata used in the generated output feed.

    For a more thorough description and an example of these settings, please have a look at the initial settings file generated in the previous step.

  4. (Optionally) Run the tests:

    newspeak test newspeak
    

    This might take a while, so go fetch a cup of coffee. If something fails, please supply the output of the command newspeak test newspeak --traceback in an issue on GitHub.

  5. Create the admin user and the SQLite database (configuring a different database is optional):

    newspeak syncdb --migrate
    
  6. Start the local webserver:

    newspeak run_gunicorn
    
  7. Open http://127.0.0.1:8000/admin/ in your browser and add a feed. Only the URL is required; the description and title are fetched automatically, as is the first set of entries.

  8. (Optionally) Configure one or more keyword-based filters for your feed(s).

  9. Run the following command to update the feeds:

    newspeak update_feeds
    

    (Optionally, add -v <1|2|3> to get more feedback on the process.)

  10. Look at the pretty feeds: open http://127.0.0.1:8000/all/rss/ or http://127.0.0.1:8000/all/atom/ in your favorite feed reader. All input feeds will be aggregated there.

    Alternatively, the original feeds, keywords and XPath expressions as used by Bits of Freedom are contained in a fixture called feeds_bof.json. This fixture can be loaded using:

    newspeak loaddata feeds_bof
    
  11. Set up a cron job to automatically update the feed data using the newspeak update_feeds command. For example, a cron job updating the feeds every hour could look as follows:

    0 * * * *  <full_path_to_>/newspeak update_feeds
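
As a rough illustration of what the NEWSPEAK_THREADS setting from step 3 controls, the sketch below fetches several feeds concurrently with a bounded pool of workers. It uses the standard library's ThreadPoolExecutor as a stand-in; Newspeak's own crawler may use a different concurrency mechanism. The fetch_all and fake_fetch functions, the example URLs and the value 4 are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

NEWSPEAK_THREADS = 4  # illustrative value only, not a recommended default


def fetch_all(urls, fetch, threads=NEWSPEAK_THREADS):
    """Call fetch(url) for every URL using at most `threads` workers.

    Returns a dict mapping each URL to whatever fetch returned for it.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map yields results in the same order as the input URLs.
        return dict(zip(urls, pool.map(fetch, urls)))


def fake_fetch(url):
    """Placeholder for a real HTTP request, so the sketch runs offline."""
    return "<feed data for %s>" % url


urls = ["http://example.com/a.rss", "http://example.com/b.atom"]
results = fetch_all(urls, fake_fetch)
print(results["http://example.com/a.rss"])  # <feed data for http://example.com/a.rss>
```

A larger pool helps when many feeds are slow to respond, since most of the time is spent waiting on the network rather than computing.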
    

Upgrading

  1. Run the pip installation command again:

    pip install -e \
      git+https://github.com/bitsoffreedom/newspeak.git#egg=newspeak
    
  2. (Optionally) Run the tests:

    newspeak test newspeak
    
  3. Apply any database migrations:

    newspeak migrate
    


Last updated Apr 12th, 2013
