Hint
If you're reading this file as pure ReST text, use the
buildDoc script
provided with the source distribution of XMLObject.
Are you tired of fighting with SAX and DOM to deal with your XML data?
XMLObject combines the best object-oriented features of Python with XML
handling. Declare your XMLObjects and forget about the internal
plumbing: you deal with objects, you get XML. That's all.
If you're looking for some kind of persistence system in XMLObject, you
may be disappointed. XMLObjects are basically designed to handle and
store data. More complex concepts like circular references, which
seem to be vital for persistence management, are missing from
XMLObject. If you persist (:)), have a look at YAML and other
real persistence systems.
XMLObject is at an early stage of development, so it doesn't currently
support all the XML goodies described in the W3C specifications. The
effort goes into simplifying XML management. Currently planned features
are:
- xmlns support
- DTD to XMLObject transition:
  - recursion handling
  - code generation (you provide a DTD, XMLObjects are generated)
  - validation
- XML Schema support
There isn't much to know... Still, a small case study may make things
clearer. What about an XML playlist system?
Declaring XMLObject classes can be compared to writing a DTD, but it's
far more readable. For now, though, XMLObject is less powerful
than a DTD at expressing every possible XML grammar. Here is what the
example looks like:
from XMLObject import *  # brings in XMLObject and the Node/Attribute classes

class Song(XMLObject):
    file = StringAttribute()
    length = IntegerAttribute()
    artist = StringAttribute(optional=True)
    title = TextNode(default = 'Track Name')
    comment = CommentNode(default = 'Blah blah')

class Playlist(XMLObject):
    _entities = [ ('&xml;','eXtensible Markup Langage')]
    name = StringAttribute()
    songs = ListNode('Song')
As you can see, there's no real XML reference here, except the
'XMLObject' :-) There are a few types of Nodes:
- The attributes (e.g. <obj attr="something" />):
  - CDATAttribute : string data
  - NMTokenAttribute : any string except :_-. characters
  - NMTokensAttribute : NMTokenAttribute without space, tab or
    carriage return characters
  - StringAttribute : any string data (just like CDATAttribute but
    with a friendlier name)
  - IntegerAttribute : integer data
- The content:
  - TextNode : stores 'string' data
  - RawNode : stores a string with annoying characters (<, >, &, ...) in
    a CDATA section (<![CDATA[blah < > é]]>)
  - ItemNode : refers to another XMLObject
  - ListNode : builds a list storing XMLObjects of a given type
  - ChoiceNode : like ItemNode, but can refer to itself or to any type of Node
  - CommentNode : inserts comments in the XML (<!-- ... -->)
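To make the link between these node kinds and the XML syntax concrete, here is a small sketch that uses only Python's standard xml.dom.minidom module, not XMLObject itself. It builds one element by hand with attributes, a text child, a CDATA section and a comment, which are the constructs the Attribute, TextNode, RawNode and CommentNode types are described as producing. The element names (song, title, notes) are illustrative assumptions.

```python
from xml.dom.minidom import Document

doc = Document()

# Attributes: <song file="..." length="..."/>
song = doc.createElement("song")
song.setAttribute("file", "foobar.ogg")
song.setAttribute("length", str(300))  # attribute values are always strings

# Text content: <title>Bar</title>
title = doc.createElement("title")
title.appendChild(doc.createTextNode("Bar"))
song.appendChild(title)

# Raw content kept verbatim inside a CDATA section
notes = doc.createElement("notes")
notes.appendChild(doc.createCDATASection("blah < > &"))
song.appendChild(notes)

# Comment: <!-- Blah blah -->
song.appendChild(doc.createComment(" Blah blah "))

xml_text = song.toxml()
print(xml_text)
```

This is the kind of hand-written DOM code that the declarative classes are meant to spare you.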
Each Node can be made optional with the dedicated boolean keyword
parameter optional. The default Node value can also be changed
by passing it to the Node constructor. A few XMLObject class attributes
may be overridden:
- _name : a string identifying the XMLObject instead
  of the class name. Any space characters in _name
  are replaced by underscores.
- _entities : a list of entity tuples
  (e.g. ('&toBeReplaced;', 'this is very very long data'))
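The two rules above can be sketched in plain Python. This is only an illustration of the described behaviour, not XMLObject's actual code: the helper names normalize_name and expand_entities are hypothetical, and the assumption is that each entity tuple is applied as a simple text substitution.

```python
def normalize_name(name):
    """Replace space characters so the name can be used as an XML tag."""
    return name.replace(" ", "_")


def expand_entities(text, entities):
    """Apply each (reference, replacement) tuple to the text, in order."""
    for reference, replacement in entities:
        text = text.replace(reference, replacement)
    return text


entities = [("&xml;", "eXtensible Markup Langage")]
print(normalize_name("my playlist"))
print(expand_entities("My Favorites in &xml;", entities))
```

With the playlist entity list, the second call yields the same name attribute value that appears in the sample output of this document.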
XMLObjects are instantiated through their constructor. XMLObject
members (like name, title and artist in the playlist example) can be
set by passing values to the constructor:
s1 = Song(file='foobar.ogg', length=300, artist='foo', title='Bar')
pList = Playlist(songs=[s1])
pList.name = 'My Favorites in &xml;'
A ListNode behaves just like a Python list. As you can see in the Song
class declaration, the artist attribute is optional, so you can
leave it unset:
s2 = Song(file='opensource.ogg', comment='hey man it rocks')
s2.length = 250
pList.songs.append(s2)
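For readers curious about how optional members and default values can work in a declarative class, here is a minimal self-contained sketch. It is not XMLObject's implementation: Field and Record are hypothetical stand-ins for a Node declaration and for XMLObject, and the missing() helper is an assumption added only to show which required members are still unset.

```python
class Field:
    """Hypothetical stand-in for a Node or Attribute declaration."""

    def __init__(self, optional=False, default=None):
        self.optional = optional
        self.default = default


class Record:
    """Hypothetical stand-in for XMLObject: members come from Field declarations."""

    def __init__(self, **values):
        fields = self.declared_fields()
        unknown = set(values) - set(fields)
        if unknown:
            raise TypeError("unknown members: %s" % ", ".join(sorted(unknown)))
        for name, field in fields.items():
            # Use the value passed to the constructor, else the declared default.
            setattr(self, name, values.get(name, field.default))

    @classmethod
    def declared_fields(cls):
        # Class attributes that are Field instances, in declaration order.
        return {n: f for n, f in vars(cls).items() if isinstance(f, Field)}

    def missing(self):
        # Required members that still have no value.
        return [n for n, f in self.declared_fields().items()
                if not f.optional and getattr(self, n) is None]


class Song(Record):
    file = Field()
    length = Field()
    artist = Field(optional=True)
    title = Field(default="Track Name")


s2 = Song(file="opensource.ogg")
print(s2.title)      # the declared default is used
print(s2.missing())  # length is required but not set yet
s2.length = 250
print(s2.missing())  # nothing is missing any more
```

The design choice illustrated here is that declarations live on the class while values live on the instance, so setting s2.length never touches the shared declaration.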
The final goal of XMLObject is, of course, to output XML data. To do so,
use the toXml method. To build an XMLObject from its
XML representation, use the fromXml class method.
xmlPlaylist = pList.toXml()
pList2 = Playlist.fromXml(xmlPlaylist)
print(pList2.toXml())
assert xmlPlaylist == pList2.toXml(), 'Problem during import/export'
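The same export, import and compare check can be tried without XMLObject installed. The sketch below uses the standard library's xml.etree.ElementTree as a stand-in, so it shows the round-trip idea only and says nothing about how toXml and fromXml work internally. The playlist content is illustrative.

```python
import xml.etree.ElementTree as ET

# Build a small playlist tree by hand.
playlist = ET.Element("playlist", name="My Favorites")
song = ET.SubElement(playlist, "song", file="foobar.ogg", length="300")
ET.SubElement(song, "title").text = "Bar"

# Export to text, import it again, then export a second time.
xml_text = ET.tostring(playlist, encoding="unicode")
reparsed = ET.fromstring(xml_text)
xml_again = ET.tostring(reparsed, encoding="unicode")

print(xml_text)
assert xml_text == xml_again, "Problem during import/export"
```

A round trip that reproduces the exact same text is a cheap regression test for any serializer.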
Here is some XML output:
<?xml version="1.0" encoding="iso-8859-1" ?>
<playlist name="My Favorites in eXtensible Markup Langage">
  <song artist="foo" file="foobar.ogg" length="300">
    <!-- Blah blah -->
    <title>
      Bar
    </title>
  </song>
  <song file="opensource.ogg" length="250">
    <!-- hey man it rocks -->
    <title>
      Track Name
    </title>
  </song>
</playlist>
Keyword parameters can be passed to the toXml method:
- headers: boolean switch telling whether you want the <? ?> processing
  instruction.
- tabLength: integer giving the indentation width (2 by default)
- encoding: charset to put in the initial processing instruction
  (iso8859-1 by default)
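To see what such switches amount to, here is a sketch of a comparable function built on the standard library's xml.dom.minidom. The render function is hypothetical; it only borrows the three keyword names described above and maps them onto minidom's toprettyxml, so its exact output layout differs from what toXml produces.

```python
from xml.dom.minidom import Document


def render(doc, headers=True, tabLength=2, encoding="iso-8859-1"):
    """Serialize a minidom Document with options named like toXml's."""
    indent = " " * tabLength
    if headers:
        # With an encoding, minidom returns bytes that start with the <?xml ... ?> line.
        return doc.toprettyxml(indent=indent, encoding=encoding).decode(encoding)
    # Serializing only the root element leaves the <?xml ... ?> line out.
    return doc.documentElement.toprettyxml(indent=indent)


doc = Document()
playlist = doc.createElement("playlist")
playlist.setAttribute("name", "Favorites")
doc.appendChild(playlist)
song = doc.createElement("song")
song.setAttribute("file", "foobar.ogg")
playlist.appendChild(song)

print(render(doc))
print(render(doc, headers=False, tabLength=4))
```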
Voilà. Pretty simple, no?
After the previous introduction, we'll study an XML example from
the Zvon DTD tutorial. It's not as useful as the playlist system,
but it shows how to design XMLObjects from a DTD fragment. By the way, this
example is in the unit tests too. So let's see the XMLObjects:
class XXX(XMLObject):
    aaa = ListNode('AAA',optional=False)
    bbb = ListNode('BBB',optional=False)

class AAA(XMLObject):
    mainNode = ChoiceNode(['BBB','CCC'])

class BBB(XMLObject):
    mainNode = ChoiceNode(['#PCDATA', 'CCC'],optional=True, noLimit=True)

class CCC(XMLObject):
    mainNode = TextNode()
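The declarations above mirror DTD element declarations, whose occurrence indicators (none, ?, + and *) translate naturally into the optional and noLimit keywords. The sketch below parses simple element declarations into that form. It is an illustration only: parse_element_decl is a hypothetical helper, the DTD strings are reconstructed from the class declarations rather than quoted from the tutorial, and the mapping onto optional/noLimit is an assumption based on how BBB is declared.

```python
import re

# Occurrence indicator -> flags named after the keywords used above (assumed mapping).
OCCURRENCE = {
    "": {"optional": False, "noLimit": False},   # exactly one
    "?": {"optional": True, "noLimit": False},   # zero or one
    "+": {"optional": False, "noLimit": True},   # one or more
    "*": {"optional": True, "noLimit": True},    # zero or more
}

ELEMENT_RE = re.compile(r"<!ELEMENT\s+(\w+)\s+\(([^)]*)\)([?*+]?)\s*>")
ITEM_RE = re.compile(r"^(#?\w+)([?*+]?)$")


def parse_element_decl(decl):
    """Parse a flat <!ELEMENT name (a, b) or (a | b)*> declaration."""
    match = ELEMENT_RE.match(decl.strip())
    if match is None:
        raise ValueError("unsupported declaration: %r" % decl)
    name, body, group_indicator = match.groups()
    kind = "sequence" if "," in body else "choice"
    children = []
    for item in re.split(r"[,|]", body):
        item_match = ITEM_RE.match(item.strip())
        if item_match is None:
            raise ValueError("unsupported content item: %r" % item)
        child, indicator = item_match.groups()
        children.append((child, OCCURRENCE[indicator]))
    return {"name": name, "kind": kind, "children": children,
            "group": OCCURRENCE[group_indicator]}


for decl in ["<!ELEMENT XXX (AAA+ , BBB+)>",
             "<!ELEMENT AAA (BBB | CCC)>",
             "<!ELEMENT BBB (#PCDATA | CCC)*>",
             "<!ELEMENT CCC (#PCDATA)>"]:
    print(parse_element_decl(decl))
```

Only flat, single-group content models are handled here; nested groups would need a real parser.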
Following the Zvon example, we start putting some data in the XMLObjects:
xxx = XXX()
xxx.aaa.append(CCC('Precisely one element.'))
bb = BBB()
bb.append(CCC())
bb.append(CCC())
bb.append(CCC())
xxx.aaa.append(bb)
xxx.bbb.append(BBB())
XXX elements must store at least one AAA and at least one BBB, so the
following code will fail:
xx = XXX()
print(xx.toXml())  # fails: xx.aaa and xx.bbb are both empty
Each XMLObject is a DTD element. As you may have noticed, some
XMLObjects have only one Node: mainNode. That means you can put
only one Node in them, but the object can still have Attributes. The
following example shows how XMLObjects with a mainNode behave:
cc = CCC('Some Text')
print(cc.toXml(headers=False))
print(cc.mainNode)
Now let's analyze the BBB class, which can store either strings
(#PCDATA) or CCC objects, zero or more times (*). #PCDATA
is a special alternative handled by ChoiceNode: it means that
strings without an enclosing tag can be inserted in the ChoiceNode. The
following snippet shows how to use BBB:
bb2 = BBB()
bb2.append('This is')
bb2.append(CCC())
bb2.append('a combination')
bb2.append(CCC())
bb2.append('of')
bb2.append(CCC('CCC elements'))
bb2.append('and text')
bb2.append(CCC())
xxx.bbb.append(bb2)
xxx.bbb.append(BBB('Text only.'))
Because BBB can store many objects, it behaves as a list, just like
a ListNode, except that it can store more than one XMLObject type.
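For comparison, here is a sketch of the same mixed content built with the standard library's xml.etree.ElementTree, which has no list-like interface for this: interleaved strings have to be stored in the parent's text or in the preceding child's tail. The build_mixed helper is hypothetical and only wraps that bookkeeping behind a list of parts, which is the convenience the append-based interface above provides.

```python
import xml.etree.ElementTree as ET


def build_mixed(tag, parts):
    """Build an element from a list mixing strings and child elements."""
    parent = ET.Element(tag)
    last_child = None
    for part in parts:
        if isinstance(part, str):
            if last_child is None:
                # Text before the first child belongs to the parent.
                parent.text = (parent.text or "") + part
            else:
                # Text after a child is stored as that child's tail.
                last_child.tail = (last_child.tail or "") + part
        else:
            parent.append(part)
            last_child = part
    return parent


filled = ET.Element("cCC")
filled.text = "CCC elements"

bbb = build_mixed("bBB", [
    "This is", ET.Element("cCC"),
    "a combination", ET.Element("cCC"),
    "of", filled,
    "and text", ET.Element("cCC"),
])
mixed_xml = ET.tostring(bbb, encoding="unicode")
print(mixed_xml)
```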
Surprise: the XML output looks almost exactly like the one in the Zvon DTD
tutorial :-)
<?xml version="1.0" encoding="iso-8859-1" ?>
<xXX>
  <aAA>
    <cCC>
      Precisely one element.
    </cCC>
  </aAA>
  <aAA>
    <bBB>
      <cCC/>
      <cCC/>
      <cCC/>
    </bBB>
  </aAA>
  <bBB/>
  <bBB>
    This is
    <cCC/>
    a combination
    <cCC/>
    of
    <cCC>
      CCC elements
    </cCC>
    and text
    <cCC/>
  </bBB>
  <bBB>
    Text only.
  </bBB>
</xXX>
As you can see, one can design XMLObjects from a DTD fragment. The
next step is to automate this and provide DTD output and full
validation support.
Simply doing from XMLObject import * won't pollute your
namespace much. Still, here are the exported symbols, detailed per module:
From XMLObject.main:
From XMLObject.Node:
From XMLObject.Nodes:
- ItemNode
- ListNode
- TextNode
- ChoiceNode
- RawNode
- CommentNode
From XMLObject.Attributes:
- CDATAttribute
- NMTokenAttribute
- NMTokensAttribute
- StringAttribute
- IntegerAttribute