
XML overuse? (was Re: Python to XML to Python conversion)

From: Huaiyu Zhu <hua...@gauss.almadan.ibm.com>
Thu, 18 Jul 2002 18:10:40 +0000 (UTC)
Clark C . Evans <cce at clarkevans.com> wrote:
>On Tue, Jul 16, 2002 at 10:14:51PM +0000, Huaiyu Zhu wrote:
>| Thanks a lot for this link.  The basic idea is very similar, but apparently
>| they have done a lot more of formal specification than I have ever
>| attempted.  There are several differences in the details, so neither is
>| superset of the other.  I'll comment on the differences once I have time to
>| read through their docs.
>
>I look forward to the commentary, could you do it or cc the
>YAML discussion list?

That'll be after I get time to read through YAML docs and review my old
code and docs.
    
>| The emphasis is on using indentation and leading markers to denote
>| structure, in contrast to markups, punctuation, quotes and escapes in the
>| markup languages.
>
>Exactly.   We started with leading markers (% and @ initially) and
>eventually found ways that allowed us to skip these...

How like minds think alike. :-)   Perl opened my mind to the possibility of
heterogeneous hierarchical data structures.

>I'd love to hear about the overlap; I'm sure we don't do everything.
>But if you found something important that we don't have, I'd love to
>know since we'd like to start finalizing the spec at this time so that
>implementations can start emerging.
>
>I'd love to hear more about your thoughts on YAML, and if possible,
>we'd really welcome your participation!

I'll try to find time to participate, but time is always in short supply.  

Here are some comments at first glance.  I don't see a description of the
semantics of the structures independent of any syntax.  It is possible to
define all the canonical transforms among the structures [1] without
reference to any particular representation.  I'd also like to emphasize that
all the indentations, markers etc. should be configurable in a document [2][3].

[1] Canonical transforms, such as {a, b, c} -> [a, b, c] -> {(1:a), (3:c),
	(2:b)}.  There are a few dozen of them among set, seq, dict, seqdict.
	Some have partial inverses.  None of them is a one-to-one correspondence.
	That's why I treat all four as basic structures.  These four are the
	combinations of keyed/nonkeyed and ordered/unordered.  Additional kinds
	of structures, such as bags (whether keyed and whether ordered), may be
	added later on. [4]
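
A minimal Python sketch of the chain in [1], assuming a seqdict can be modeled as a list of (key, value) pairs. The function names are hypothetical and not taken from any existing code; sorting is just one arbitrary way to pick an order for a set.

```python
# Hypothetical helper names; the post above does not define an API.

def set_to_seq(s):
    """Unordered, nonkeyed -> ordered, nonkeyed (an order must be chosen)."""
    return sorted(s)

def seq_to_seqdict(seq):
    """Ordered, nonkeyed -> ordered, keyed: key each item by its 1-based position."""
    return [(i, item) for i, item in enumerate(seq, start=1)]

def seqdict_to_dict(pairs):
    """Ordered, keyed -> unordered, keyed: the ordering is dropped."""
    return dict(pairs)

def dict_to_set(d):
    """Keyed -> nonkeyed: forget the keys, keep the values."""
    return set(d.values())

def seq_to_set(seq):
    """Ordered -> unordered: order and duplicates are both lost."""
    return set(seq)

s = {"a", "b", "c"}
seq = set_to_seq(s)                         # {a, b, c} -> [a, b, c]
d = seqdict_to_dict(seq_to_seqdict(seq))    # [a, b, c] -> {(1:a), (2:b), (3:c)}

# Dict equality ignores order, matching the shuffled dict written in [1].
same_as_shuffled = d == {1: "a", 3: "c", 2: "b"}

# Partial inverse: going back to a set recovers s here, but only because
# the values happen to be distinct.
round_trip = dict_to_set(d) == s

# Not one-to-one: two different sequences collapse to the same set.
collapsed = seq_to_set(["a", "b", "a"]) == seq_to_set(["b", "a"])
```

Each transform discards or invents some information (an order, a set of keys), which is why none of them is invertible in general.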

[2] I tried the following kinds of indentations (where n is level)
	  '(%s)' % n
	  '  ' * n
	  '  ' * n + '|'
	Obviously there can be a lot of other variations.  Such flexibility
	would allow many common document formats to be transformed into
	conforming format with minimum effort, sometimes by just adding a
	metacomment at the beginning of the document.  For example, the format
	of these very paragraphs should be accommodated.
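
The three indentation styles in [2] can be treated as interchangeable functions of the level n. The sketch below assumes a tree of (label, children) tuples; the `INDENT_STYLES` table and the `render` function are hypothetical names used only for illustration.

```python
# Each style maps a nesting level n to the prefix printed before a node.
INDENT_STYLES = {
    "numbered": lambda n: "(%s)" % n,
    "spaces": lambda n: "  " * n,
    "bar": lambda n: "  " * n + "|",
}

def render(tree, indent, level=0):
    """Return one output line per node of a (label, children) tree."""
    label, children = tree
    lines = [indent(level) + label]
    for child in children:
        lines.extend(render(child, indent, level + 1))
    return lines

tree = ("root", [("a", []), ("b", [("c", [])])])

for name in ("numbered", "spaces", "bar"):
    print("\n".join(render(tree, INDENT_STYLES[name])))
    print()
```

A metacomment at the top of a document would then only need to name which style (or which prefix function) is in effect.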

[3] I would allow encoding and encryption on a per-node basis, not just
	at the file level.  In reality how to break up a tree into subtrees
	to fit in files is largely arbitrary.  This calls for meta comments
	on each node with a simple syntax for describing them.
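
One way to picture the per-node meta comments in [3] is a small dictionary attached to every node. This is only a sketch under assumptions: the `Node` class, the `"encoding"` key and the `DECODERS` table are invented for illustration, and base64 stands in for whatever encoding or encryption a real format would support.

```python
import base64
from dataclasses import dataclass, field

@dataclass
class Node:
    value: str
    meta: dict = field(default_factory=dict)      # per-node meta comments
    children: list = field(default_factory=list)

# Hypothetical registry: encoding name -> function that decodes a node value.
DECODERS = {
    "plain": lambda text: text,
    "base64": lambda text: base64.b64decode(text).decode("utf-8"),
}

def decode(node):
    """Return a copy of the tree with each value decoded per its own meta."""
    encoding = node.meta.get("encoding", "plain")
    return Node(
        DECODERS[encoding](node.value),
        {},
        [decode(child) for child in node.children],
    )

encoded_leaf = base64.b64encode("secret".encode("utf-8")).decode("ascii")
tree = Node("config", children=[
    Node("name"),                                  # stored as plain text
    Node(encoded_leaf, {"encoding": "base64"}),    # only this node is encoded
])

decoded = decode(tree)
```

Because the setting travels with the node, a subtree can be moved into another file without changing how it is read.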

[4] One thing I have not solved is whether the keys can only be strings.  If
	keys can be substructures themselves, there are further correspondences
	between sets, dicts and bags, such as {a, b} -> {a:1, b:1}.  This leads
	to the issue of the identity of structures.  Example: {a, b}=={a} if
	a==b.  This complicates things and that's perhaps where I stopped.
	(Over-generalization perhaps?)
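
Python's built-in types give one concrete illustration of the issue in [4]. In the sketch below the function names are hypothetical; `frozenset` is used because Python requires set members and dict keys to be hashable, which is its own answer to the question of structured keys.

```python
from collections import Counter

def set_to_dict(s):
    """{a, b} -> {a:1, b:1}: every member becomes a key with count 1."""
    return {item: 1 for item in s}

def bag_to_dict(items):
    """A bag (multiset) maps to a dict of occurrence counts."""
    return dict(Counter(items))

counts_from_set = set_to_dict({"a", "b"})
counts_from_bag = bag_to_dict(["a", "a", "b"])

# Substructures as members: two separately built but equal frozensets.
a = frozenset({1, 2})
b = frozenset({2, 1})

# Identity of structures: {a, b} == {a} because a == b.
collapses = {a, b} == {a}

# The same substructure also works as a dict key.
keyed = set_to_dict({a, b})
```

Membership here is decided by equality and hashing rather than object identity, so the two equal substructures count as one member.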

So my overall comment is that this approach can be made more 'meta' than any
particular syntax or structure would allow.  The worst thing about XML is
that one has to conform to its (mostly arbitrary) syntax conventions instead
of thinking about the underlying data structure that's pertinent for the
task at hand.  I do believe that the good thing about standards is there are
so many to choose from.  A meta syntax would open up the possibility of
interoperability on a much larger scale than XML could handle comfortably.
It is often easier to define a particular syntax by fixing some parameters
in a meta syntax.  Perhaps these are already in YAML, since I have spent only
half an hour reading its docs.

Huaiyu
