XML

> In any case, XML has a certain model of what data looks like, regardless
> of the actual tags defined. All data is hierarchical, for instance. There's
> nothing you can do about it, it is intrinsic to XML, a part of the data
> model which is defined by level 2.

Specifically, an XML document is a tree of nodes (thus the hierarchy).

So yes, XML does impose a tree structure on the data representation. Fortunately, a large number of data models are representable via trees (for instance, a list is simply a VERY flat tree with every node a leaf).
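As a minimal sketch of the "flat tree" remark, assuming Python's standard `xml.etree.ElementTree` module and made-up element names (`list`, `item`), a plain list can be written as a root whose children are all leaves:

```python
import xml.etree.ElementTree as ET

# Hypothetical data: any plain list of strings will do.
items = ["L1", "L2", "L3"]

# The list becomes a root element with one leaf child per entry.
root = ET.Element("list")
for name in items:
    ET.SubElement(root, "item").text = name

xml_text = ET.tostring(root, encoding="unicode")

# "Very flat": no child has children of its own.
is_flat = all(len(child) == 0 for child in root)

print(xml_text)
print(is_flat)
```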

FYI,

Zach Frey

_______________________________________________
LinuxPLC mailing list
[email protected]
http://linuxplc.org/mailman/listinfo/linuxplc
 
Oscar Esteban

> the actual tags defined. All data is hierarchical, for instance. There's
> nothing you can do about it, it is intrinsic to XML, a part of the data
> model which is defined by level 2.

Well, yes and no. XML bends well to hierarchical models, but you can model non-hierarchical data too. I'm thinking, for instance, of a closed directed graph (my example is probably not quite XML compliant; I beg your pardon, and the parsers'):

<GRAPH>
<NODE>
Name=N1
<LINK>Name=N2</LINK>
<LINK>Name=N3</LINK>
</NODE>
<NODE>
Name=N2
<LINK>Name=N1</LINK>
</NODE>
<NODE>
Name=N3
<LINK>Name=N2</LINK>
</NODE>
<NODE>
Name=N4
<LINK>Name=N2</LINK>
<LINK>Name=N3</LINK>
<LINK>Name=N4</LINK>
</NODE>
</GRAPH>

It could be said that we have father-to-son relationships (NODE-LINK) and also siblings (NODE-NODE). But I can't think right now of a less hierarchical example which is harder to represent.

> I'm not aware of any such protocol for XML.
>
> > This is the same as with a database (whether SQL or not).
>
> Except that an SQL database is designed with multiple users in mind, all
> modifying the database concurrently. I don't think XML provides that,
> and

That's not a job for XML, but for a DB server. We could use the XML above and then let a piece of equipment ask things like "select * from graph where NodeName = 'N1'". XML determines how some data is represented, but it doesn't provide an access standard. SQL, on the contrary, specifies how you can access your data, but dictates nothing about the low-level representation.


Jiri Baum:
> > the actual tags defined. All data is hierarchical, for instance.
> > There's nothing you can do about it, it is intrinsic to XML, a part of
> > the data model which is defined by level 2.

Oscar Esteban:
> Well, yes, and no. It bends well to hierarchical models, but you can
> model non-hierarchical data. I'm thinking, for instance, of a closed
> directed graph (probably my example is not quite XML compliant, I beg
> your pardon, and the parsers'):
[snip]

Yes - but that's not really a good representation of a directed graph.

(It may be the best possible, and it may be fairly close to the mathematicians' definition, but it's not really a good representation.)

For instance, there is the problem of dangling links. Does XML (or the associated experience) provide techniques for dealing with that problem?
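For what it's worth, XML 1.0 DTDs do define ID and IDREF attribute types, and a validating parser is required to reject an IDREF that matches no ID in the document. The same check can be sketched by hand. The following is only an illustration, assuming Python's standard `xml.etree.ElementTree` module and a variant of the graph above in which node names are moved into hypothetical `name` and `to` attributes:

```python
import xml.etree.ElementTree as ET

# Variant of the graph example with names as attributes (an assumption).
# N3 links to N9, which does not exist.
GRAPH_XML = """
<GRAPH>
  <NODE name="N1"><LINK to="N2"/><LINK to="N3"/></NODE>
  <NODE name="N2"><LINK to="N1"/></NODE>
  <NODE name="N3"><LINK to="N2"/><LINK to="N9"/></NODE>
</GRAPH>
"""

def dangling_links(xml_text):
    """Return (source, target) pairs whose target names no existing NODE."""
    root = ET.fromstring(xml_text)
    names = {node.get("name") for node in root.findall("NODE")}
    return [
        (node.get("name"), link.get("to"))
        for node in root.findall("NODE")
        for link in node.findall("LINK")
        if link.get("to") not in names
    ]

print(dangling_links(GRAPH_XML))
```

Either way the check is bolted on after parsing; the tree structure itself knows nothing about the links.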

Your example is in some ways exactly the kind of thing I would hope to avoid: data placed in an XML format not because it is a good fit, but simply for the sake of using XML.

> But I can't think right now of a less hierarchical example which is
> harder to represent.

Any kind of multidimensional data. To fit into a hierarchical system, it must be somehow factored into a tree. This can be done by dimension, by deciding that one or another dimension is the most important, or in some other way, but it will never be a natural fit.

Since much real-world data is multidimensional, this provides a wide variety of examples.

Consider a library. The books are first divided into fiction and non-fiction; fiction is then sorted by author, while non-fiction is organized hierarchically by topic (the Dewey decimal system).

Of course, there may be more than one author, and a book may span more than one topic. (In fact, the sub-sub-topic of Computer Science belongs to two different topics, so its books are in two very different places.) Some fiction books are better sorted by topic too, such as Star Trek or Star Wars books, and some libraries do that, as an exception to the Author rule.

If the library stores serials and monographs separately, you can toss a coin whether conference proceedings will be one or the other.

Where *The Science of Discworld* belongs, with its alternating chapters of wizards (fiction) and science popularization (non-fiction), I've no idea.


Consider a text with various tags (formatting, change-tracking, hypertext etc). These tag types are in many cases orthogonal and there's no reason why they shouldn't be allowed to overlap arbitrarily, yet XML forbids it. One cannot say "<latin>X. <change>plugh</latin> (previously <latin>X. xyzzy</latin>)</change>" - one must either break up the species name or turn the single change into two separate changes.
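A quick way to see this, sketched with Python's standard `xml.etree.ElementTree` module: the overlapping markup is rejected as not well-formed, while the version that splits the single change into two parses. The enclosing `<p>` element is an assumption added only so that each sample has a single root.

```python
import xml.etree.ElementTree as ET

# The overlapping markup from the text, wrapped in a hypothetical <p> root.
overlapping = (
    "<p><latin>X. <change>plugh</latin> "
    "(previously <latin>X. xyzzy</latin>)</change></p>"
)

# One workaround: turn the single change into two separate changes.
split_change = (
    "<p><latin>X. <change>plugh</change></latin>"
    "<change> (previously <latin>X. xyzzy</latin>)</change></p>"
)

def is_well_formed(text):
    """True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(overlapping))
print(is_well_formed(split_change))
```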


The linuxplc.conf file is organized by section, which means that the definition of a point and its mapping to a physical input or output(s) are separated. In an inversion of the hierarchy, we could have all information pertaining to a point in one place, defining it and immediately listing its physical input and/or output(s), and perhaps its display in the MMI.

This would be sensible - and in fact may be a good way of writing linuxplc.conf files; the current format allows it.

We can extend this approach to the logic program: rather than listing the rungs in order and specifying which points play what part in them, we might well list all the points in order and specify which rungs they play what part in. This sounds ridiculous, yet a variant of it may well be sensible.

> > I'm not aware of any such protocol for XML.

> > > This is the same as with a database (whether SQL or not).

> > Except that an SQL database is designed with multiple users in mind,
> > all modifying the database concurrently. I don't think XML provides

> That's not the work for XML, but for a DB server.

Of course, but when choosing a data representation one of the factors should be what kinds of services are available for it. The relational database format has a whole slew of mature, well-developed and largely well-understood services of that kind.

They are not intrinsic to the relational database representation; it just so happens that - for whatever reasons - they are available.

> We could use the XML above and then let an equipment ask things like
> "select * from graph where NodeName = 'N1'". XML determines how some data
> is represented, but it doesn't provide an access standard. SQL, on the
> contrary, specifies how you can access your data, but dictates nothing on
> low level representation.

Actually, SQL has a pretty good idea of what your data has to look like. It doesn't care about the low-level representation, but it must directly translate into the standard tabular form.
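To make the point concrete, here is a rough sketch, assuming Python's standard `sqlite3` and `xml.etree.ElementTree` modules. The graph has to be flattened into rows before the quoted query means anything. The table layout (`NodeName`, `LinkTo`) and the `name`/`to` attributes are assumptions made for the illustration, not something either SQL or XML prescribes.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The graph example, with names moved into hypothetical attributes.
GRAPH_XML = """
<GRAPH>
  <NODE name="N1"><LINK to="N2"/><LINK to="N3"/></NODE>
  <NODE name="N2"><LINK to="N1"/></NODE>
  <NODE name="N3"><LINK to="N2"/></NODE>
</GRAPH>
"""

# Step 1: flatten the tree into rows, one per link.
root = ET.fromstring(GRAPH_XML)
rows = [
    (node.get("name"), link.get("to"))
    for node in root.findall("NODE")
    for link in node.findall("LINK")
]

# Step 2: only now can SQL be applied, to the table rather than to the XML.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE graph (NodeName TEXT, LinkTo TEXT)")
db.executemany("INSERT INTO graph VALUES (?, ?)", rows)
result = db.execute(
    "SELECT * FROM graph WHERE NodeName = 'N1' ORDER BY LinkTo"
).fetchall()

print(result)
```

The query never sees the XML; all the interesting decisions are made in the flattening step.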

I don't know; maybe there's a bastardized version of SQL that can handle generic XML documents, and very likely there's an XML definition that can be used to store SQL tables (inefficiently). But SQL itself presupposes so much about your data that trying to apply it to XML is a case of putting the cart before the horse, a clear case of drowning in the alphabet soup.

SQL is just a very wrong language to use for XML.


Jiri
--
Jiri Baum <[email protected]>
What we do Every Night! Take Over the World! Step 1 - bid for SMOFcon

>2. A large XML document will require a lot of processing. A PC104
>based system will certainly fail. I use a dual 650 machine and it takes
>almost 2 seconds for a document to get parsed (I am using the IBM
>parser at present).

Two seconds per document on a modern PC!!! Boy, I swear computers get slower every year. How large is this document? Isn't it just processing text? Why so slow?
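One way to put numbers on the question is to time a parse of a synthetic document of known size. This is only a sketch, assuming Python's standard `xml.etree.ElementTree` module; the element and attribute names are invented, and the result says nothing about the IBM parser or the document mentioned above.

```python
import time
import xml.etree.ElementTree as ET

def make_document(n_points):
    """Build a synthetic document with n_points hypothetical <point> elements."""
    root = ET.Element("LinuxPLC")
    for i in range(n_points):
        ET.SubElement(
            root, "point", name=f"P{i}", owner="Chaser", at=f"{i // 8}.{i % 8}"
        )
    return ET.tostring(root, encoding="unicode")

def time_parse(xml_text):
    """Parse the text once; return (number of top-level elements, seconds)."""
    start = time.perf_counter()
    root = ET.fromstring(xml_text)
    return len(root), time.perf_counter() - start

doc = make_document(10_000)
count, seconds = time_parse(doc)
print(f"{len(doc)} characters, {count} elements, parsed in {seconds:.4f} s")
```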

Bill Sturm


I'll respond to the rest of the post later (it's half past one in the morning), but I wanted to ask about one thing.

Ken Irving wrote (an XML example):
> <?xml version="1.0" ?>
> <LinuxPLC>
> <SMM sem_key="42">
...

Does XML (or something) provide guidance about what XML structures should be used for what?

Take the last line above, for instance:

<SMM sem_key="42">
<SMM><sem_key>42</sem_key>
<SMM><sem_key value="42"/>
<SMM><var name="sem_key">42</var>
<SMM><var name="sem_key" value="42"/>
<SMM><var><name>sem_key</name><value>42</value></var>
<SMM><IPC semid="42" globalmap="86" confmap="99"/>

How would one choose between these seven representations? For that matter, which of them is best?
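One practical angle on the choice is what the reading code looks like. The sketch below, assuming Python's standard `xml.etree.ElementTree` module, pulls the same value out of the first five forms; the `</SMM>` closing tags are added so that each fragment parses on its own, and the labels are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Five of the candidate forms, each closed so it is a complete document.
CANDIDATES = {
    "attribute": '<SMM sem_key="42"/>',
    "child element": "<SMM><sem_key>42</sem_key></SMM>",
    "child with value attribute": '<SMM><sem_key value="42"/></SMM>',
    "generic var, text": '<SMM><var name="sem_key">42</var></SMM>',
    "generic var, attribute": '<SMM><var name="sem_key" value="42"/></SMM>',
}

# The access code each form requires.
READERS = {
    "attribute": lambda smm: smm.get("sem_key"),
    "child element": lambda smm: smm.findtext("sem_key"),
    "child with value attribute": lambda smm: smm.find("sem_key").get("value"),
    "generic var, text": lambda smm: smm.findtext("var[@name='sem_key']"),
    "generic var, attribute": lambda smm: smm.find("var[@name='sem_key']").get("value"),
}

values = {
    label: READERS[label](ET.fromstring(text))
    for label, text in CANDIDATES.items()
}

for label, value in values.items():
    print(f"{label}: {value}")
```

The specific forms let the reader ask for `sem_key` by name; the generic `var` forms need a lookup on an attribute value instead.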


Jiri
--
Jiri Baum <[email protected]>
What we do Every Night! Take Over the World! Step 1 - bid for SMOFcon

On Thu, Sep 07, 2000 at 01:48:12AM +1100, Jiri Baum wrote:
>
> Ken Irving wrote (an XML example):
> > <?xml version="1.0" ?>
> > <LinuxPLC>
> > <SMM sem_key="42">
> ...
>
> Does XML (or something) provide guidance about what XML structures should
> be used for what?

No, XML just defines what is well-formed, the rest is up to the language developer.

> Take the last line above, for instance:
>
> <SMM sem_key="42">
> <SMM><sem_key>42</sem_key>
> <SMM><sem_key value="42"/>
> <SMM><var name="sem_key">42</var>
> <SMM><var name="sem_key" value="42"/>
> <SMM><var><name>sem_key</name><value>42</value></var>
> <SMM><IPC semid="42" globalmap="86" confmap="99"/>
>
> How would one choose between these seven representations? For that matter,
> which of them is best?

Given that all are equivalent, this is practically a "religious issue", and you'd probably get 7 different opinions from 7 people. Some criteria could include: what maps best to underlying data structures, what might be processed or accessed most efficiently by available tools, what maps best to other tools or protocols that might be involved, which might be easiest to read/edit manually, ...

The classic question is "element vs attribute", and many discussions can be found on that topic. This page seems to cover the issue pretty well, with a number of links to opinions and discussion:

http://www.oasis-open.org/cover/elementsAndAttrs.html

In the project's demo/basic/linuxplc.conf file, from which the example data was copied, it says:

# The SMM has four configuration settings:
# globalmap_key, confmap_pg, globalmap_pg, sem_key

which could suggest that those four items might rate their own attribute name (if there's only one set of values for each SMM section), or element name (perhaps if the value might change as the section is processed). Your 4th example,

<SMM><var name="sem_key">42</var>

treats the parameter name as data, which might arguably be inappropriate, or maybe overly generic.

After adding </SMM> closing tags and enclosing in a primary element, a simple tool to view the structure of the XML yields the following for the above set of representations:

LinuxPLC()
  SMM(sem_key,)
  SMM()
    sem_key()
  SMM()
    sem_key(value,)
  SMM()
    var(name,)
  SMM()
    var(name,value,)
  SMM()
    var()
      name()
      value()
  SMM()
    IPC(semid,confmap,globalmap,)

where each element is on a separate line, indented, and with attributes listed in parentheses.
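A structure viewer of this kind takes only a few lines. The following is a guess at how such a tool could work, not the tool actually used here; it assumes Python's standard `xml.etree.ElementTree` module and a two-space indent, and it lists attributes in document order.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0, indent="  "):
    """Return one line per element: indented tag, attribute names in parentheses."""
    attrs = "".join(f"{name}," for name in element.attrib)
    lines = [f"{indent * depth}{element.tag}({attrs})"]
    for child in element:
        lines.extend(outline(child, depth + 1, indent))
    return lines

# Three of the representations, closed and wrapped in a primary element.
doc = """<LinuxPLC>
<SMM sem_key="42"></SMM>
<SMM><var name="sem_key" value="42"/></SMM>
<SMM><var><name>sem_key</name><value>42</value></var></SMM>
</LinuxPLC>"""

print("\n".join(outline(ET.fromstring(doc))))
```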

Some of the representations seem to be more "informative", since the application's parameter names are visible, while others seem more generic.

Just as a final point, perhaps to show that "anything is possible", another (questionable) option might be to enclose a preexisting file format inside an XML element (some special characters might need escaping), e.g., to transport the data in an XML envelope. Probably little to be gained this way, but this also would be well-formed, and therefore "correct" XML:

<?xml version="1.0" ?>
<linuxplc>
# shared memory map section

[SMM]
# The SMM has four configuration settings:
# globalmap_key, confmap_pg, globalmap_pg, sem_key

sem_key = 42

# P O I N T S
# -----------

point L1 "light 1" Chaser at 0.0
point L2 "light 2" Chaser at 0.1
point L3 "light 3" Chaser at 0.2
point L4 "light 4" Chaser at 0.3
point left "&lt;- (key L)" Kbd at 1.4
point right "-> (key R)" Kbd at 1.5
point quit "quit (key Q)" Kbd at 1.6
</linuxplc>

With the aforementioned tool, this has the structure:

linuxplc()

And the content of that element (or its "character data") would be the original file format (with escaped entities converted back to their native form).
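A small sketch of the round trip, assuming Python's standard `xml.etree.ElementTree` module and an abbreviated copy of the envelope above: the parser hands back the character data with `&lt;` already converted to `<`.

```python
import xml.etree.ElementTree as ET

# Abbreviated envelope; the "<" in the point description is escaped as &lt;.
envelope = """<?xml version="1.0" ?>
<linuxplc>
[SMM]
sem_key = 42

point left "&lt;- (key L)" Kbd at 1.4
point right "-> (key R)" Kbd at 1.5
</linuxplc>
"""

root = ET.fromstring(envelope)

# The element has no children; everything is character data.
config_text = root.text

print(len(root))
print(config_text)
```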

--
Ken Irving
Trident Software
[email protected]
 
On Wed, Sep 06, 2000 at 10:30:08AM -0800, Ken Irving wrote:
> On Thu, Sep 07, 2000 at 01:48:12AM +1100, Jiri Baum wrote:

> > Ken Irving wrote (an XML example):
> > > <?xml version="1.0" ?>
> > > <LinuxPLC>
> > > <SMM sem_key="42">
> > ...

> > Does XML (or something) provide guidance about what XML structures
> > should be used for what?

> No, XML just defines what is well-formed, the rest is up to the language
> developer.

Is there something else that would provide guidance about what XML structures should be used for what?

> > Take the last line above, for instance:

> > <SMM sem_key="42">
> > <SMM><sem_key>42</sem_key>
> > <SMM><sem_key value="42"/>
> > <SMM><var name="sem_key">42</var>
> > <SMM><var name="sem_key" value="42"/>
> > <SMM><var><name>sem_key</name><value>42</value></var>
> > <SMM><IPC semid="42" globalmap="86" confmap="99"/>

> > How would one choose between these seven representations? For that
> > matter, which of them is best?

> Given that all are equivalent, this is practically a "religious issue",
> and you'd probably get 7 different opinions from 7 people.

That sounds like a recipe for disaster.

> Some criteria could include: what maps best to underlying data
> structures, what might be processed or accessed most efficiently by
> available tools, what maps best to other tools or protocols that might be
> involved, which might be easiest to read/edit manually, ...

...
> In the project's demo/basic/linuxplc.conf file, from which the example
> data was copied, it says:

> # The SMM has four configuration settings:
> # globalmap_key, confmap_pg, globalmap_pg, sem_key

> which could suggest that those four items might rate their own attribute
> name (if there's only one set of values for each SMM section), or element
> name (perhaps if the value might change as the section is processed).

There's only one set of those values in the SMM section.

But some sections may have dozens of settings, and the number of possible settings (in all sections together) will be large. Should all of them be defined as elements or attributes?

...
> Just as a final point, perhaps to show that "anything is possible",

Hmm, approaches which take the view that "anything is possible" sound suspicious.


Jiri
--
Jiri Baum <[email protected]>
What we do Every Night! Take Over the World! Step 1 - bid for SMOFcon

On Wed, Sep 06, 2000 at 05:08:23AM -0800, Ken Irving wrote:
> On Wed, Sep 06, 2000 at 10:54:24PM +1100, Jiri Baum wrote:
> > Jiri Baum:
> > > > I meant the other way - when some other program wishes to change
> > > > the data. Relational databases have well-defined, time-proven
> > > > semantics for multiple access (with well-known problems).

> > > > I'm not aware of any such protocol for XML.

> > Ken Irving:
> > > That's not the domain of XML, which is simply a file (or stream)
> > > format.

> > Exactly.

> > Is such a protocol available for XML? (It doesn't have to be
> > implemented; at this stage I just want to know whether it exists.)

> AFAIK, there are probably several "XML databases" in the works or
> available, commercially [1] if not GPLed, but IMHO if a database is what
> you want, then use a database.

Exactly.

Do we want a database?

> XML might be a way to stuff information into and out of databases, but
> there's nothing inherent in XML that deals with multiple access or
> interactive use.

OK.

> In plain vanilla XML, there's XPath, essentially a query language, but
> it's (mostly) in the reading direction, AFAIK.

OK.
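For a flavour of what such a read-only query looks like, here is a sketch using the limited XPath subset that Python's standard `xml.etree.ElementTree` module supports. The `point` elements and their `name` and `owner` attributes are hypothetical, loosely modelled on the config file discussed in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only the SMM/sem_key form comes from the thread.
doc = ET.fromstring("""
<LinuxPLC>
  <SMM sem_key="42"/>
  <point name="L1" owner="Chaser"/>
  <point name="left" owner="Kbd"/>
  <point name="right" owner="Kbd"/>
</LinuxPLC>
""")

# Path expressions select nodes; they do not modify the document.
sem_key = doc.find("SMM").get("sem_key")
kbd_points = [p.get("name") for p in doc.findall("point[@owner='Kbd']")]

print(sem_key)
print(kbd_points)
```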


Jiri
--
Jiri Baum <[email protected]>
What we do Every Night! Take Over the World! Step 1 - bid for SMOFcon

On Wed, Sep 06, 2000 at 08:22:08AM -0400, Zach Frey wrote:

> > In any case, XML has a certain model of what data looks like,
> > regardless of the actual tags defined. All data is hierarchical, for
> > instance. There's nothing you can do about it, it is intrinsic to XML,
> > a part of the data model which is defined by level 2.

> Specifically, an XML document is a tree of nodes (thus the hierarchy).

> So yes, XML does impose a tree structure on the data representation.
> Fortunately, a large number of data models are representable via trees
> (for instance, a list is simply a VERY flat tree with every node a leaf).

Yes, a list is a degenerate tree - but that's a special example.

There are at least two kinds of examples that don't fit into a tree structure very well:

- graphs (directed or undirected). I'm not sure if there is a
  really good way of representing graphs, but XML doesn't even try.

- multidimensional data. This fits reasonably well into a database
  table. To fit it into a tree, we usually have to denote one of
  the dimensions "major" (more or less arbitrarily) and factor the
  data on that, then recurse.

  Alternately, we can represent the data in a database table, which
  is then turned into a list and stored flat. Thus defeating the
  entire purpose of the exercise.
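The arbitrariness of the "major" dimension can be shown in a few lines. This is only a sketch, assuming Python's standard `xml.etree.ElementTree` module; the rows and the `data`, `group` and `entry` element names are invented. The same table is factored into two different trees depending on which column is chosen as major.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-dimensional data: (point, property, value).
ROWS = [
    ("L1", "owner", "Chaser"),
    ("L1", "address", "0.0"),
    ("left", "owner", "Kbd"),
    ("left", "address", "1.4"),
]

def factor(rows, major):
    """Nest rows under one dimension: major=0 groups by point, major=1 by property."""
    minor = 1 - major
    root = ET.Element("data")
    groups = {}
    for row in rows:
        group = groups.get(row[major])
        if group is None:
            group = groups[row[major]] = ET.SubElement(root, "group", key=row[major])
        ET.SubElement(group, "entry", key=row[minor]).text = row[2]
    return root

by_point = factor(ROWS, major=0)
by_property = factor(ROWS, major=1)

print(ET.tostring(by_point, encoding="unicode"))
print(ET.tostring(by_property, encoding="unicode"))
```

Both trees hold the same four values; neither is more correct than the other, which is the point being made above.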

Jiri
--
Jiri Baum <[email protected]>
What we do Every Night! Take Over the World! Step 1 - bid for SMOFcon

I have been watching this discussion go on for a while. I have worked with XML for a long time and I was skeptical at the start, but now I think it will be a good fit for what we want. To answer your question about databases: XML is primarily used for backend transfers to a web browser (with a parser in the middle). It works fine with a database model, since at the end of the day you are only dealing with ROWS and COLUMNS.

I think some are beginning to think that you can write an XML document any which way you like as long as you comply with the markup language. This is true, but it is wrong to say that we can write our document any which way we like and the parser will magically take care of it.

If we decide that XML is the right approach, then we have to decide on the TAGS and ATTRIBUTES and what format we want to keep them in. The examples that Jiri gave were all correct, but the last line looked the best as it requires the least processing (one tag, with all ATTRIBUTES grouped).

This is somewhat flexible and can be coded for (yes, we still have to code for the parser - that's where the rules are needed).

I hope this helps.
Thanks
Bob-

Curt Wuollet

Hi All

I have been following the discussion loosely. Here's my opinion, FWIW, based on what I'm hearing.

XML would give us a standard way of coding config files. This would be consistent with the rest of the computing world (or at least the leading edge).
Very few people on the list have substantial knowledge to add to the discussion, or very few care. There are some who care a lot, for reasons that seem a bit obscure in relation to lplc.
Since config files are typically simple assignments, adding them later would have about the same cost as now.
Solving the special case by coding parsers for what we want in a config file is not all that difficult, and they are small and understood relatively easily by most. Solving the general case by including an XML parser from someone else opens a whole can of worms not particularly related to development of lplc, takes parsing outside of what I perceive to be our depth, and such parsers are larger. I'm still waiting on the benefits.
While they are a wonderful topic for the CS types in our midst, I still don't see the problem they solve for data that is likely to be highly specific, even to a particular installation, and fairly unlikely to be exchanged or shared off the machine.
I don't feel like learning another markup language to do config files; I'm not sure that makes sense. Call me pragmatic or ignorant as you wish.
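To give a sense of scale for the special-case parser, here is a rough sketch in Python. It is not the project's parser; the grammar is inferred only from the sample lines quoted earlier in this thread (`[section]`, `key = value`, `point NAME "description" OWNER at ADDRESS`, `#` comments), and the function and field names are made up.

```python
import shlex

# Sample in the format quoted earlier in the thread (abbreviated).
SAMPLE = '''
# shared memory map section
[SMM]
sem_key = 42

point L1 "light 1" Chaser at 0.0
point left "<- (key L)" Kbd at 1.4
'''

def parse_config(text):
    """Return (settings, points) for the assumed line formats."""
    settings = {}   # section name -> {key: value}
    points = []     # (name, description, owner, address)
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            settings.setdefault(section, {})
        elif line.startswith("point "):
            # shlex keeps the quoted description together as one token.
            _, name, description, owner, _at, address = shlex.split(line)
            points.append((name, description, owner, address))
        elif "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            settings.setdefault(section, {})[key] = value
        else:
            raise ValueError(f"unrecognised line: {raw!r}")
    return settings, points

settings, points = parse_config(SAMPLE)
print(settings)
print(points)
```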

But, as always, here's my killer argument. If you want them in there, go ahead and write the code. If it's done and it works, that's probably what we'll use until someone writes better code. This is the ultimate resolution anyway.

Regards

cww

Hi,

Before using XML I was just hacking together a file format by hand. It started to get ugly when I added the configuration for the IO scanner. The IO devices vary much more from one device to the next.

The XML structure did force some integrity. In particular you can develop a document definition (Document Type Definition - DTD) that includes argument types, hierarchy, allowed repetitions, etc. With XML you can parse the document with software that checks only basic layout and syntax (non-validating), or you can do full document checking that also verifies content (validating). On the FreeLC (the PLC) I have written my own parser that does the final validation, but in the programming software and other client programs a validating parser makes a lot of sense, and will add another layer of defense against user program bugs. This format also means that if you want to construct PLC files by hand you can use an XML editor, and be more sure that the result is correct.
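The two layers can be sketched as follows. This is not FreeLC's parser and not DTD validation; it is a hand-written stand-in, assuming Python's standard `xml.etree.ElementTree` module (which checks well-formedness only) and an invented rule table covering allowed parents and attribute types.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: element name -> (allowed parent, required attributes and types).
RULES = {
    "LinuxPLC": (None, {}),
    "SMM": ("LinuxPLC", {"sem_key": int}),
    "point": ("LinuxPLC", {"name": str, "owner": str}),
}

def validate(xml_text):
    """Return a list of problems; an empty list means the document passed."""
    try:
        root = ET.fromstring(xml_text)      # layer 1: well-formedness
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    problems = []

    def check(element, parent_tag):         # layer 2: content rules
        rule = RULES.get(element.tag)
        if rule is None:
            problems.append(f"unknown element <{element.tag}>")
            return
        allowed_parent, attributes = rule
        if parent_tag != allowed_parent:
            problems.append(f"<{element.tag}> not allowed here")
        for name, kind in attributes.items():
            value = element.get(name)
            if value is None:
                problems.append(f"<{element.tag}> is missing {name}")
                continue
            try:
                kind(value)
            except ValueError:
                problems.append(
                    f"<{element.tag}> {name}={value!r} is not {kind.__name__}"
                )
        for child in element:
            check(child, element.tag)

    check(root, None)
    return problems

good = '<LinuxPLC><SMM sem_key="42"/><point name="L1" owner="Chaser"/></LinuxPLC>'
bad = '<LinuxPLC><SMM sem_key="forty-two"/><point name="L1"/><rung/></LinuxPLC>'

print(validate(good))
print(validate(bad))
```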

Hugh

