The HTMLTags module defines a class for each valid HTML tag, written in uppercase letters. To create a piece of HTML, the general syntax is :
t = TAG(innerHTML, key1=val1,key2=val2,...)
so that "print t" results in :
<TAG key1="val1" key2="val2" ...>innerHTML</TAG>
For instance :
print A('bar', href="foo") ==> <A href="foo">bar</A>
"""Classes to generate HTML in Python
The HTMLTags module defines a class for all the valid HTML tags, written in
uppercase letters. To create a piece of HTML, the general syntax is :
t = TAG(inner_HTML, key1=val1,key2=val2,...)
so that "print t" results in :
<TAG key1="val1" key2="val2" ...>inner_HTML</TAG>
For instance :
print A('bar', href="foo") ==> <A href="foo">bar</A>
To generate HTML attributes without value, give them the value True :
print OPTION('foo',SELECTED=True,value=5) ==>
<OPTION value="5" SELECTED>foo</OPTION>
The inner_HTML argument can be an instance of an HTML class, so that
you can nest tags, like this :
print B(I('foo')) ==> <B><I>foo</I></B>
TAG instances support addition :
print B('bar')+INPUT(name="bar") ==> <B>bar</B><INPUT name="bar">
and repetition :
print TD(' ')*3 ==> <TD> </TD><TD> </TD><TD> </TD>
For complex expressions, a tag can be nested in another using the operator <=
Considering the HTML document as a tree, this means "add child" :
form = FORM(action="foo")
form <= INPUT(name="bar")
form <= INPUT(Type="submit",value="Ok")
If you have a list (or any iterable) of instances, you can't concatenate the
items with sum(instance_list) because sum() starts from the integer 0, which
can't be added to a tag. So there is a function called Sum() which will do
the same :
Sum( TR(TD(i)+TD(i*i)) for i in range(100) )
generates the rows of a table showing the squares of integers from 0 to 99
A simple document can be produced by :
print HTML( HEAD(TITLE('Test document')) +
BODY(H1('This is a test document')+
'First line'+BR()+
'Second line'))
This will be rendered as :
<HTML>
<HEAD>
<TITLE>Test document</TITLE>
</HEAD>
<BODY>
<H1>This is a test document</H1>
First line
<BR>
Second line
</BODY>
</HTML>
If the document is more complex it is more readable to create the elements
first, then to print the whole result in one instruction. For example :
head = HEAD()
head <= TITLE('Record collection')
head <= LINK(rel="Stylesheet",href="doc.css")
title = H1('My record collection')
table = TABLE()
table <= TR(TH('Title')+TH('Artist'))
for rec in records:
    row = TR()
    # note the attribute key Class with leading uppercase
    # because "class" is a Python keyword
    row <= TD(rec.title,Class="title")+TD(rec.artist,Class="artist")
    table <= row
print HTML(head+BODY(title+table))
"""
import cStringIO

class TAG:
    """Generic class for tags"""
    def __init__(self, inner_HTML="", **attrs):
        self.tag = self.__class__.__name__
        self.inner_HTML = inner_HTML
        self.attrs = attrs
        self.children = []
        self.brothers = []

    def __str__(self):
        res=cStringIO.StringIO()
        w=res.write
        if self.tag != "TEXT":
            w("<%s" %self.tag)
            # attributes which will produce arg = "val"
            attr1 = [ k for k in self.attrs
                if not isinstance(self.attrs[k],bool) ]
            w("".join([' %s="%s"'
                %(k.replace('_','-'),self.attrs[k]) for k in attr1]))
            # attributes with no argument
            # if value is False, don't generate anything
            attr2 = [ k for k in self.attrs if self.attrs[k] is True ]
            w("".join([' %s' %k for k in attr2]))
            w(">")
        if self.tag in ONE_LINE:
            w('\n')
        w(str(self.inner_HTML))
        for child in self.children:
            w(str(child))
        if self.tag in CLOSING_TAGS:
            w("</%s>" %self.tag)
        if self.tag in LINE_BREAK_AFTER:
            w('\n')
        # brothers are the instances concatenated to this one with +
        if hasattr(self,"brothers"):
            for brother in self.brothers:
                w(str(brother))
        return res.getvalue()

    def __le__(self,other):
        """Add a child"""
        if isinstance(other,str):
            other = TEXT(other)
        self.children.append(other)
        other.parent = self
        return self

    def __add__(self,other):
        """Return a new instance : concatenation of self and another tag"""
        res = TAG()
        res.tag = self.tag
        res.inner_HTML = self.inner_HTML
        res.attrs = self.attrs
        res.children = self.children
        # build a new list so that self.brothers is left unchanged
        res.brothers = self.brothers + [other]
        return res

    def __radd__(self,other):
        """Used to add a tag to a string"""
        if isinstance(other,str):
            return TEXT(other)+self
        else:
            raise ValueError("Can't concatenate %s and instance" %other)

    def __mul__(self,n):
        """Replicate self n times, with tag first : TAG * n"""
        res = TAG()
        res.tag = self.tag
        res.inner_HTML = self.inner_HTML
        res.attrs = self.attrs
        # keep the children too, so the first copy matches the others
        res.children = self.children
        for i in range(n-1):
            res += self
        return res

    def __rmul__(self,n):
        """Replicate self n times, with n first : n * TAG"""
        return self*n

# list of tags, from the HTML 4.01 specification
CLOSING_TAGS = ['A', 'ABBR', 'ACRONYM', 'ADDRESS', 'APPLET',
'B', 'BDO', 'BIG', 'BLOCKQUOTE', 'BUTTON',
'CAPTION', 'CENTER', 'CITE', 'CODE',
'DEL', 'DFN', 'DIR', 'DIV', 'DL',
'EM', 'FIELDSET', 'FONT', 'FORM', 'FRAMESET',
'H1', 'H2', 'H3', 'H4', 'H5', 'H6',
'I', 'IFRAME', 'INS', 'KBD', 'LABEL', 'LEGEND',
'MAP', 'MENU', 'NOFRAMES', 'NOSCRIPT', 'OBJECT',
'OL', 'OPTGROUP', 'PRE', 'Q', 'S', 'SAMP',
'SCRIPT', 'SELECT', 'SMALL', 'SPAN', 'STRIKE',
'STRONG', 'STYLE', 'SUB', 'SUP', 'TABLE',
'TEXTAREA', 'TITLE', 'TT', 'U', 'UL',
'VAR', 'BODY', 'COLGROUP', 'DD', 'DT', 'HEAD',
'HTML', 'LI', 'P', 'TBODY','OPTION',
'TD', 'TFOOT', 'TH', 'THEAD', 'TR']
NON_CLOSING_TAGS = ['AREA', 'BASE', 'BASEFONT', 'BR', 'COL', 'FRAME',
'HR', 'IMG', 'INPUT', 'ISINDEX', 'LINK',
'META', 'PARAM']
# create the classes
for tag in CLOSING_TAGS + NON_CLOSING_TAGS + ['TEXT']:
    exec("class %s(TAG): pass" %tag)

def Sum(iterable):
    """Return the concatenation of the instances in the iterable
    Can't use the built-in sum() on non-integers"""
    it = [ item for item in iterable ]
    if it:
        return reduce(lambda x,y:x+y, it)
    else:
        return ''

# whitespace-insensitive tags : a line break is written after them
# when rendering, for pretty-printing
LINE_BREAK_AFTER = NON_CLOSING_TAGS + ['HTML','HEAD','BODY',
    'FRAMESET','FRAME',
    'TITLE','SCRIPT',
    'TABLE','TR','TD','TH','SELECT','OPTION',
    'FORM',
    'H1', 'H2', 'H3', 'H4', 'H5', 'H6',
    ]
# tags whose opening tag should be alone on its line
# (the comma after 'FRAMESET' was missing, which silently merged it
# with 'SCRIPT' into the single string 'FRAMESETSCRIPT')
ONE_LINE = ['HTML','HEAD','BODY',
    'FRAMESET',
    'SCRIPT',
    'TABLE','TR','TD','TH','SELECT','OPTION',
    'FORM',
    ]

if __name__ == '__main__':
    head = HEAD(TITLE('Test document'))
    body = BODY()
    body <= H1('This is a test document')
    body <= 'First line' + BR() + 'Second line'
    print HTML(head + body)
This module makes producing HTML easier than writing raw HTML code in strings. Because opening and closing tags are generated, the resulting HTML should be clean, with no risk of forgetting to close a tag or misspelling a tag name.
Tags: web
Tags should be in lowercase. According to the XHTML specification, HTML tags must be in lowercase. This makes them compatible with XML.
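One way to follow this suggestion without renaming the classes would be to lowercase the names at render time. The sketch below is a stand-alone illustration written for Python 3, not the recipe's code: the LowerTag class is a hypothetical name, and it ignores valueless attributes and non-closing tags to stay short.

```python
class LowerTag:
    """Hypothetical stand-in for the recipe's TAG class (Python 3)."""

    def __init__(self, inner_html="", **attrs):
        # The class name stays uppercase; only the rendered name is lowercased.
        self.tag = self.__class__.__name__.lower()
        self.inner_html = inner_html
        self.attrs = attrs

    def __str__(self):
        attrs = "".join(' %s="%s"' % (k.lower(), v)
                        for k, v in self.attrs.items())
        return "<%s%s>%s</%s>" % (self.tag, attrs, self.inner_html, self.tag)


class A(LowerTag):
    pass


link = str(A("bar", href="foo"))
print(link)  # <a href="foo">bar</a>
```

The calling code keeps the uppercase class names, so existing scripts would not need to change.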
"Prior Art" - not patentable :-). Something very similar was done some years ago by Andy Dustman in his HyperText package :
http://dustman.net/andy/python/HyperText
I use it instead of HTMLgen to create web pages in CGI scripts, and it works very well. Too bad it's not better known! Its object-oriented approach (nesting calls like you nest tags in HTML) makes it simpler and more natural to use than Pierre's solution above, IMHO.
HyperText. The idea is so simple that I was surprised no one had done this before. I googled for "generate HTML in Python", which returned HTMLGen and templating systems, but not HyperText. It is indeed almost the same, with only a slight difference in the syntax (it uses TAG(*args,attrs) instead of TAG(arg1+arg2+...,attrs) ); besides, it's a complete package, also supporting SGML, XHTML etc. Thanks for mentioning it, it deserves to be better known.
By the way, nesting tags - TABLE(TR(TD('foo')+TD('bar'))) - is also supported by HTMLTags
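To make the syntax difference concrete, here is a minimal sketch (Python 3) of a tag class that accepts both calling styles. It is neither HyperText's nor HTMLTags' implementation: the Tag class is hypothetical, and concatenation simply returns a string to keep the example short.

```python
class Tag:
    """Hypothetical minimal tag accepting several content arguments."""

    def __init__(self, *content, **attrs):
        self.name = self.__class__.__name__
        self.content = content
        self.attrs = attrs

    def __str__(self):
        attrs = "".join(' %s="%s"' % item for item in self.attrs.items())
        inner = "".join(str(c) for c in self.content)
        return "<%s%s>%s</%s>" % (self.name, attrs, inner, self.name)

    def __add__(self, other):
        # Simplification: concatenation yields a plain string.
        return str(self) + str(other)

    def __radd__(self, other):
        return str(other) + str(self)


class TR(Tag):
    pass


class TD(Tag):
    pass


# Style described for HyperText: children passed as separate arguments
varargs_style = str(TR(TD("foo"), TD("bar")))
# Style used by HTMLTags: children concatenated with + into one argument
addition_style = str(TR(TD("foo") + TD("bar")))
print(varargs_style == addition_style)
```

Both calls render the same markup; the choice is mostly a matter of which reads better in the calling code.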
Pyhtmloo. Some time ago I wrote a similar Python module called pyhtmloo. It does more or less the same as yours. I've even written a parser that builds pyhtmloo objects from an HTML page. This way the circle is closed ;-).
You can check it at: http://pyhtmloo.sourceforge.net
Did it myself too. Just yesterday I wrote nearly the same module - it's easier to write it than to find it :) I also added ability to add and access items and attributes with shift and indexing, and __str__() for converting to text, e.g.:
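The commenter's own code is not shown here. Purely as an illustration of the idea, a tag class with shift and indexing support might look like the sketch below (Python 3). The Node class and every name in it are assumptions, not the commenter's module.

```python
class Node:
    """Hypothetical tag class; not the commenter's actual code."""

    def __init__(self, name, **attrs):
        self.name = name
        self.attrs = dict(attrs)
        self.items = []

    def __lshift__(self, item):
        # node << child appends a child and returns node, so calls chain.
        self.items.append(item)
        return self

    def __getitem__(self, key):
        # Integer keys index children; string keys look up attributes.
        if isinstance(key, int):
            return self.items[key]
        return self.attrs[key]

    def __setitem__(self, key, value):
        if isinstance(key, int):
            self.items[key] = value
        else:
            self.attrs[key] = value

    def __str__(self):
        attrs = "".join(' %s="%s"' % (k, v) for k, v in self.attrs.items())
        inner = "".join(str(i) for i in self.items)
        return "<%s%s>%s</%s>" % (self.name, attrs, inner, self.name)


p = Node("p", id="intro")
p << "Hello, " << Node("b") << "!"
p[1] << "world"          # p[1] is the <b> child
p["class"] = "lead"      # string key sets an attribute
html = str(p)
print(html)
```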
HyperText's roots. HyperText does have its roots in HTMLGen. It's not based on HTMLGen, but it does borrow a few ideas. Another, more modern package which is similar, but has much more support for XML in general, is XIST. http://www.livinglogic.de/Python/xist/
Bug fix. Version 1.6 fixes a bug in the __mul__ method
Revision 8 : add the __radd__ method, allowing for a syntax like
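For readers unfamiliar with reflected operators, the following stand-alone sketch (Python 3) shows how a __radd__ method lets a plain string appear on the left of +. It is a simplified illustration, not the recipe's code: the Tag and Fragment classes are hypothetical.

```python
class Fragment:
    """Hypothetical container for a sequence of strings and tags."""

    def __init__(self, parts):
        self.parts = list(parts)

    def __add__(self, other):
        return Fragment(self.parts + [other])

    def __radd__(self, other):
        if isinstance(other, str):
            return Fragment([other] + self.parts)
        return NotImplemented

    def __str__(self):
        return "".join(str(p) for p in self.parts)


class Tag:
    """Hypothetical minimal tag with a name and some text."""

    def __init__(self, name, text=""):
        self.name = name
        self.text = text

    def __str__(self):
        return "<%s>%s</%s>" % (self.name, self.text, self.name)

    def __add__(self, other):
        return Fragment([self, other])

    def __radd__(self, other):
        # Python calls this for "string" + tag, because str.__add__
        # does not know how to handle a Tag.
        if isinstance(other, str):
            return Fragment([other, self])
        return NotImplemented


line = "First line " + Tag("B", "bold") + " second line"
print(line)  # First line <B>bold</B> second line
```

Returning NotImplemented for unsupported operands lets Python raise its usual TypeError; the recipe raises ValueError instead.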
I tried your recipe and I have the following query with respect to the code below. Naively, that would create the same page twice... but the body gets added to the head variable when it is called a second time.
Is this a feature, or am I missing something? I can see that this would happen from the add operator, because it adds to self. But I'm no expert; I have only been programming for a few hours.
Perhaps I'm not using the tags correctly.
Cheers, Al
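The behaviour described here is what happens when + modifies its left operand instead of returning a new object. The sketch below (Python 3) contrasts the two approaches; both classes are hypothetical and much simpler than the recipe's TAG class.

```python
class MutatingTag:
    """Hypothetical tag whose + modifies the left operand in place."""

    def __init__(self, name):
        self.name = name
        self.brothers = []

    def __add__(self, other):
        self.brothers.append(other)   # side effect on self
        return self

    def __str__(self):
        return "<%s>" % self.name + "".join(str(b) for b in self.brothers)


class CopyingTag(MutatingTag):
    """Same rendering, but + leaves both operands untouched."""

    def __add__(self, other):
        res = CopyingTag(self.name)
        res.brothers = self.brothers + [other]   # new list
        return res


head = MutatingTag("HEAD")
first = str(head + MutatingTag("BODY"))
second = str(head + MutatingTag("BODY"))    # BODY now appears twice

safe_head = CopyingTag("HEAD")
safe_first = str(safe_head + CopyingTag("BODY"))
safe_second = str(safe_head + CopyingTag("BODY"))
print(second, safe_second)
```

With the copying version, building the same page twice gives the same result both times.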
The revision posted today fixes the bug mentioned by Al
It also introduces another, more readable syntax to nest tags : instead of using parentheses, which can become difficult to read, a tag can be nested in another one using the <= operator, as in
Here the operator can be read as a method "add child" to the left operand. It has no logical relation with "less or equal", but graphically it has the shape of a left arrow, which makes it natural for the "add child" method
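As a rough sketch of how an "add child" operator can be built on __le__, consider the stand-alone example below (Python 3). The Tag class is a hypothetical simplification, not the recipe's TAG class.

```python
class Tag:
    """Hypothetical minimal tag with text, attributes and children."""

    def __init__(self, name, text="", **attrs):
        self.name = name
        self.text = text
        self.attrs = attrs
        self.children = []

    def __le__(self, child):
        # parent <= child : append the child and return the parent
        self.children.append(child)
        return self

    def __str__(self):
        attrs = "".join(' %s="%s"' % (k, v) for k, v in self.attrs.items())
        inner = self.text + "".join(str(c) for c in self.children)
        return "<%s%s>%s</%s>" % (self.name, attrs, inner, self.name)


ul = Tag("UL", Class="menu")
ul <= Tag("LI", "first")
ul <= Tag("LI", "second")
html = str(ul)
print(html)  # <UL Class="menu"><LI>first</LI><LI>second</LI></UL>
```

Each `ul <= ...` line is an ordinary expression statement: Python evaluates the comparison, the method appends the child, and the returned value is discarded.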