Advogato (http://www.advogato.org) exports members' diaries in a simple XML format. This script fetches the entries and stores them in a dictionary keyed by date. It should also work with other mod_virgule sites, such as http://www.badvogato.org.
#!/usr/bin/env python
import sgmllib, string, urllib
class DiaryParser(sgmllib.SGMLParser):
    """SGML parser that collects the <html> entry bodies and <date>
    stamps from an Advogato diary.xml feed.

    After parsing, ``self.entries`` holds the entry texts and
    ``self.dates`` the corresponding date strings, in document order.
    """

    # NOTE(review): the original defined end_html and start_date twice;
    # Python keeps only the last definition of each, so the duplicates
    # (including a dead start_date variant that called setliteral())
    # have been removed.  The methods below reproduce the definitions
    # that actually took effect.

    def __init__(self):
        sgmllib.SGMLParser.__init__(self)
        self.entries = []  # collected diary entry bodies
        self.dates = []    # collected date strings, parallel to entries
        self.inHtml = 0    # 1 while inside an <html> element
        self.inDate = 0    # 1 while inside a <date> element
        self.data = ""     # character data accumulated for current element

    def handle_data(self, data):
        # Accumulate character data until the enclosing end tag fires.
        self.data = self.data + data

    def unknown_starttag(self, tag, attrs):
        # Ignore any tags other than <html> and <date>.
        pass

    def unknown_endtag(self, tag):
        pass

    def start_html(self, attributes):
        self.inHtml = 1
        self.data = ""
        # Treat the entry body literally so embedded markup is kept as text.
        self.setliteral()

    def end_html(self):
        self.entries.append(self.data)
        self.inHtml = 0

    def start_date(self, attributes):
        self.data = ""
        self.inDate = 1

    def end_date(self):
        self.dates.append(self.data)
        self.inDate = 0
def getEntries(person):
    """ Fetch an Advogato member's diary and return a dictionary in the form
        { date : entry, ... }
    """
    parser = DiaryParser()
    # diary.xml is the member's machine-readable diary export.
    feed = urllib.urlopen("http://www.advogato.org/person/%s/diary.xml" % urllib.quote(person))
    # Feed the parser in 8 KB chunks so a large diary need not be
    # held in memory all at once.
    chunk = feed.read(8192)
    while chunk:
        parser.feed(chunk)
        chunk = feed.read(8192)
    parser.close()
    # map(None, a, b) pairs the two lists element-wise, padding the
    # shorter one with None (Python 2 idiom).
    return dict(map(None, parser.dates, parser.entries))
if __name__=='__main__':
import sys
print getEntries(sys.argv[1])
Tags: web