# Scratch variables (was: io library and file parsing)


#### Mario de Sousa

Jiri Baum wrote:
>
> Hello,
>
> (I'm answering to the two issues separately)
>
> Mario de Sousa:
> > Issue 2
> > -------
>
> > For logic type modules, these will have many scratch variables that maybe
> > don't need to be interfaced to any other module. This raises the
> > 'philosophical' question of whether these variables should be mapped onto
> > linuxplc points.
>
> > Take the iec compiler for example. We can take the route of (a) having
> > every Mx.y 'variable' be a linuxplc point, which means they all have to
> > be configured, or (b) we can choose to assume that this memory is private
> > to the iec logic module, and only explicitly configured memory locations
> > would be mapped onto linuxplc points.
>
> I'm for option (a), on the grounds that even private variables should be
> visible to the debugging/tracing/etc tools.

Yes, I agree, but notice that these variables would only be visible once the module calls plc_update(). Won't this be too coarse for some
debugging sessions?

> Otherwise there'll either have
> to be a separate debugging/tracing tool for every kind of logic module, or
> the option of putting them in the globalmap anyway. (...)
>
> I'm not sure to what extent this actually matters to the smm/gmm. It should
> be able to handle things anyway (for inter-module communication), and once
> it can, there's not much reason why it shouldn't.

Yes, it can handle both situations, but I think it needs to be optimised for either one. Currently each named linuxplc point takes up
48+36 bytes of configuration memory, and 1 to 32 _bits_ of user memory. This ratio is much too big if we are going to have every scratch variable as a named linuxplc point. Imagine needing 100K of
configuration memory just so we can have 1K of real memory!
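
To put rough numbers on that ratio, here is a minimal sketch of the arithmetic. Only the 48+36 bytes per point comes from the figures above; the function names are made up for illustration and are not part of the linuxplc library.

```c
#include <assert.h>
#include <stddef.h>

/* Figure quoted above: 48+36 bytes of configuration memory per named point. */
enum { CONF_BYTES_PER_POINT = 48 + 36 };

/* Configuration memory needed for n_points named points. */
size_t conf_bytes(size_t n_points)
{
    return n_points * CONF_BYTES_PER_POINT;
}

/* User memory in bytes (rounded up) for n_points points of
 * bits_per_point bits each; a point holds 1 to 32 bits. */
size_t user_bytes(size_t n_points, unsigned bits_per_point)
{
    assert(bits_per_point >= 1 && bits_per_point <= 32);
    return (n_points * bits_per_point + 7) / 8;
}
```

With 1024 points of 8 bits each this gives 1024 bytes of user memory against 86016 bytes of configuration memory, which is the same order of magnitude as the 100K-for-1K example.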

Maybe we need to come back and consider arrays or structures of linuxplc points?

I don't know. I haven't really thought this through yet. I'm still considering the implications...

Mario.

----------------------------------------------------------------------------
Mario J. R. de Sousa
[email protected]
----------------------------------------------------------------------------

_______________________________________________
LinuxPLC mailing list
[email protected]
http://linuxplc.org/mailman/listinfo/linuxplc


#### Mario de Sousa

Mario de Sousa wrote:
>
> Jiri Baum wrote:
> >
> > Hello,
> >
> > (I'm answering to the two issues separately)
> >
> > Mario de Sousa:
> > > Issue 2
> > > -------
> >
> > > For logic type modules, these will have many scratch variables that maybe
> > > don't need to be interfaced to any other module. This raises the
> > > 'philosophical' question of whether these variables should be mapped onto
> > > linuxplc points.
> >
> > > Take the iec compiler for example. We can take the route of (a) having
> > > every Mx.y 'variable' be a linuxplc point, which means they all have to
> > > be configured, or (b) we can choose to assume that this memory is private
> > > to the iec logic module, and only explicitly configured memory locations
> > > would be mapped onto linuxplc points.
> >

Jiri,

Cheers,

Mario.

----------------------------------------------------------------------------
Mario J. R. de Sousa
[email protected]
----------------------------------------------------------------------------


#### Ken E.

I think subroutines are going to be the issue. Steeplechase has a scheme for passing variables to subprograms, with these parameter kinds:

- Reference: this is essentially global in the IO table.
- RO Reference: the same as above, with read-only properties.
- Reference Array: the same as the above two, but for array data.
- Value: this is a local variable (probably a stack variable). It is not viewable by watch windows as of yet.

The problem we have found in programming our machine code in Steeplechase VLC is that when we use the high level philosophy of modularity (by using reusable subroutines) we have to manually add all the subprogram parameters for every level of the subprogram hierarchy (sometimes this can be half a dozen layers or more deep). I guess I am saying this in the hope of some kind of future inheritance ... but it goes along with the variable discussion.

My gut feel is to have everything in the IO table except for local subroutine variables. You really can't have locals in the IO table because they are dynamically allocated and may be recursive, etc etc. Configuration for local variables will be done in the logic interpreter/Module (Equates to defining a variable in your "C" function when translated).

~Ken


#### Mario de Sousa

Ken,

I have never used steeplechase, but I did take the time to visit their website to get an overview of their architecture. Mind you, they don't seem to have much info on their site, but from what I managed to gather it seems that when we have something equivalent to their
programs/subprograms/sub-subprograms etc... it will be entirely handled within the same logic linuxplc module, so all these variables can still
be local to that module. There is no need to make them public to other modules.

By using linuxplc points and synchronisation points, it is already possible to use a module as a subprogram of another module, but I wouldn't promote such use as it requires two context switches just to call a subprogram. Of course, in this case the parameters would have to be passed using the global linuxplc point table.

Cheers,

Mario.

----------------------------------------------------------------------------
Mario J. R. de Sousa
[email protected]
----------------------------------------------------------------------------


#### Jiri Baum

Ken Emmons, Jr.:
> My gut feel is to have everything in the IO table except for local
> subroutine variables. You really can't have locals in the IO table
> because they are dynamically allocated and may be recursive, etc etc.

Yes, agreed.

There is, though, the issue about which variables should be considered "local to the module" - especially if it's a standard-compliant module and
the standard doesn't provide a suitable concept for this.

Jiri
--
Jiri Baum <[email protected]>
You know you've been hacking too long when ...
... reading a book you notice the word "From" at the beginning of a line.


#### Jiri Baum

> > Mario de Sousa:
> > > Take the iec compiler for example. We can take the route of (a)
> > > having every Mx.y 'variable' be a linuxplc point, which means they
> > > all have to be configured, or (b) we can choose to assume that this
> > > memory is private to the iec logic module, and only explicitly
> > > configured memory locations would be mapped onto linuxplc points.

Jiri Baum:
> > I'm for option (a), on the grounds that even private variables should
> > be visible to the debugging/tracing/etc tools.

Mario de Sousa:
> Yes, I agree, but notice that these variables would only be visible once
> the module calls plc_update(). Won't this be too coarse for some debugging
> sessions?

True. Still, we might eventually stick a debugging tool into the gmm that allows the local map to be viewed in a uniform manner.

But even just the coarse view will oftentimes be very useful, and for more detailed debugging you want a display of the module internals anyway (eg a display of the stepladder).

> > I'm not sure to what extent this actually matters to the smm/gmm.
...
> Yes, it can handle both situations, but I think it needs to be optimised
> for either one. Currently each named linuxplc point takes up 48+36 bytes
> of configuration memory, and 1 to 32 _bits_ of user memory. This ratio
> is much too big if we are going to have every scratch variable as a named
> linuxplc point.

Actually, this ratio is probably much too big in any case... I haven't been looking at the confmap in much detail, I think you still know it better
than I do; is there any easy way to reduce this?

For instance, could all the point owners be listed in one place, with the point table just having one-byte references into the list?

With some care, this and the bit & length fields might be squashed into sixteen bits... With bit and length having a total of 529 combinations,
fitting into 10 bits, this would limit us to 64 distinct owners; is that enough?
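
A minimal sketch of that sixteen-bit squash, under stated assumptions: the low 10 bits hold a code for the (bit, length) pair and the high 6 bits hold the owner index. The function names and the exact enumeration are illustrative only. Counting only pairs with bit in 0..31, length in 1..32 and bit + length <= 32 gives 528 codes by my count, one short of the 529 quoted above (perhaps a spare code was intended); either way it fits in 10 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: low 10 bits = (bit, length) code, high 6 bits = owner.
 * Pairs are numbered bit by bit: bit b has (32 - b) legal lengths. */
uint16_t pack_point(unsigned bit, unsigned length, unsigned owner)
{
    assert(bit < 32 && length >= 1 && bit + length <= 32);
    assert(owner < 64);
    /* Number of codes used by all smaller bit offsets is b*(65-b)/2. */
    unsigned code = bit * (65 - bit) / 2 + (length - 1);
    return (uint16_t)((owner << 10) | code);
}

void unpack_point(uint16_t packed, unsigned *bit, unsigned *length,
                  unsigned *owner)
{
    unsigned code = packed & 0x3FFu;
    unsigned b = 0;

    assert(code < 528);          /* codes 528..1023 are unused here */
    while (code >= 32 - b) {     /* skip the codes of smaller bit offsets */
        code -= 32 - b;
        b++;
    }
    *bit = b;
    *length = code + 1;
    *owner = packed >> 10;
}
```

The decode loop runs at most 31 times; a 528-entry lookup table would trade a little memory for constant-time decoding.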

> Maybe we need to come back and consider arrays or structures of linuxplc
> points?

Arrays - where they are actual arrays from all points of view, ie all elements have exactly the same properties - would definitely be useful
where they are needed. They probably won't be very hard to implement, but they don't really solve the above problem.

Jiri
--
Jiri Baum <[email protected]>
You know you've been hacking too long when ...
... reading a book you notice the word "From" at the beginning of a line.


#### Mario de Sousa

Jiri Baum wrote:
>
> > > Mario de Sousa:
> > > > Take the iec compiler for example. We can take the route of (a)
> > > > having every Mx.y 'variable' be a linuxplc point, which means they
> > > > all have to be configured, or (b) we can choose to assume that this
> > > > memory is private to the iec logic module, and only explicitly
> > > > configured memory locations would be mapped onto linuxplc points.
>
> Jiri Baum:
> > > I'm for option (a), on the grounds that even private variables should
> > > be visible to the debugging/tracing/etc tools.
>
> Mario de Sousa:
> > Yes, I agree, but notice that these variables would only be visible once
> > the module calls plc_update(). Won't this be too coarse for some debugging
> > sessions?
>
> True. Still, we might eventually stick a debugging tool into the gmm that
> allows the local map to be viewed in a uniform manner.
>

Yes, good idea. Actually, if you use the --PLCisolate option the local map is already placed into shared memory. It would just be a question of
telling the debugging tool to look into _that_ shared memory instead of the shared memory being used for the global map. They use exactly the
same internal structure, so it would be transparent for the debugging tool.

> (...)
>
> > > I'm not sure to what extent this actually matters to the smm/gmm.
> ...
> > Yes, it can handle both situations, but I think it needs to be optimised
> > for either one. Currently each named linuxplc point takes up 48+36 bytes
> > of configuration memory, and 1 to 32 _bits_ of user memory. This ratio
> > is much too big if we are going to have every scratch variable as a named
> > linuxplc point.
>
> Actually, this ratio is probably much too big in any case... I haven't been
> looking at the confmap in much detail, I think you still know it better
> than I do; is there any easy way to reduce this?
>
> For instance, could all the point owners be listed in one place, with the
> point table just having one-byte references into the list?
>
> With some care, this and the bit & length fields might be squashed into
> sixteen bits... With bit and length having a total of 529 combinations,
> fitting into 10 bits, this would limit us to 64 distinct owners; is that
> enough?

We can probably optimize it quite a bit, but I think the most difficult part will be squashing the name of the point itself (currently using 32 bytes
for a maximum of 31 char length name). If we keep this limit, it will probably be difficult to go any lower than say 40 bytes per point. Remember we still need the byte-offset:bit-offset:size:owner_ptr fields, which should probably take up 4 bytes if we consider a maximum size of 64*4Kbytes for the global memory (this is using 2 bytes for the byte-offset, 5 bits for the bit-offset, 5 bits for size, and 6 bits for the owner_ptr with a maximum of 64 distinct owners).
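
As a rough illustration of that 4-byte layout (16 + 5 + 5 + 6 bits), here is a sketch using shifts and masks. The field widths come from the paragraph above; the bit positions, the function names, and storing the size as size-1 so that 1..32 fits in 5 bits are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed positions: bits 31..16 byte offset, 15..11 bit offset,
 * 10..6 size minus one, 5..0 owner index. */
uint32_t pt_pack(unsigned byte_ofs, unsigned bit_ofs, unsigned size,
                 unsigned owner)
{
    assert(byte_ofs < 65536u && bit_ofs < 32);
    assert(size >= 1 && size <= 32 && owner < 64);
    return ((uint32_t)byte_ofs << 16) | ((uint32_t)bit_ofs << 11) |
           ((uint32_t)(size - 1) << 6) | (uint32_t)owner;
}

unsigned pt_byte_ofs(uint32_t p) { return p >> 16; }
unsigned pt_bit_ofs(uint32_t p)  { return (p >> 11) & 0x1Fu; }
unsigned pt_size(uint32_t p)     { return ((p >> 6) & 0x1Fu) + 1; }
unsigned pt_owner(uint32_t p)    { return p & 0x3Fu; }
```

Explicit shifts are used rather than C bit-fields because bit-field ordering is compiler dependent, which matters if several modules read the same shared memory.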

It probably won't be easy to reach those 40 bytes if we keep the current architecture. Currently the configuration memory manager (cmm)
is used to store both the linuxplc points and the synch points configurations. To do this, the cmm is actually just a list of variable sized chunks of memory, each chunk with a name (used for the point or synch point name) and a type (to distinguish between points and synch points). The remaining bytes of each chunk of memory are used internally by each library (the synch library, the gmm library, any future libraries...) to store their configuration data. For the gmm this is the byte-offset:bit-offset:size fields, along with the owner.
This architecture requires that each chunk of memory have an overhead of at least 6 bytes to maintain the linked list of cmm memory chunks,
and the size and type fields of each chunk. This brings the total down to 42 bytes per point. Actually the cmm works practically like a
specialised memory allocation function.
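
The chunk list described above might look roughly like the following. This is only a sketch: the struct, field names and type codes are hypothetical, and it uses ordinary pointers and malloc(), whereas the real cmm lives in shared memory and would presumably link chunks by offset.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CMM_NAME_MAX 32          /* 31 characters plus the '\0' */

enum { CMM_TYPE_POINT = 1, CMM_TYPE_SYNCH = 2 };   /* assumed type codes */

struct cmm_chunk {
    struct cmm_chunk *next;      /* linked list of chunks */
    uint16_t type;               /* point or synch point */
    uint16_t size;               /* number of payload bytes */
    char name[CMM_NAME_MAX];     /* point or synch point name */
    unsigned char payload[];     /* private to the owning library */
};

/* Prepends a new chunk to the list; returns NULL if out of memory. */
struct cmm_chunk *cmm_add(struct cmm_chunk **head, uint16_t type,
                          const char *name, const void *data, uint16_t size)
{
    assert(strlen(name) < CMM_NAME_MAX);
    struct cmm_chunk *c = malloc(sizeof *c + size);
    if (c == NULL)
        return NULL;
    c->type = type;
    c->size = size;
    strcpy(c->name, name);
    if (size > 0)
        memcpy(c->payload, data, size);
    c->next = *head;
    *head = c;
    return c;
}

/* Finds a chunk by type and name, the way a by-name lookup might. */
const struct cmm_chunk *cmm_find(const struct cmm_chunk *head, uint16_t type,
                                 const char *name)
{
    for (; head != NULL; head = head->next)
        if (head->type == type && strcmp(head->name, name) == 0)
            return head;
    return NULL;
}
```

Because the type is part of the key, a point and a synch point may share a name without clashing.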

I think those 42 bytes are still a little large. Maybe we can go lower by using a variable number of bytes for the name instead of reserving 32
bytes. Like this the name would start at a certain position in the cmm memory chunk structure, and end when a '\0' is encountered. The rest of the info would continue right after the '\0' instead of at the end of the 32 byte char array. Like this we can easily support names larger than 31 chars, but the responsibility for using up a large amount of config memory would fall into the hands of the user. If they want a small config memory, then they must use shorter names.
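
A small sketch of the variable-length name idea: the name is written first, terminated by '\0', and the library data follows immediately. The function names are invented for the example, and the buffer stands in for one cmm chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes name, its '\0', then data_len bytes of data into buf.
 * Returns the number of bytes used, or 0 if it does not fit in cap. */
size_t chunk_write(unsigned char *buf, size_t cap, const char *name,
                   const void *data, size_t data_len)
{
    assert(buf != NULL && name != NULL);
    size_t name_len = strlen(name) + 1;       /* include the '\0' */
    if (name_len > cap || data_len > cap - name_len)
        return 0;
    memcpy(buf, name, name_len);
    if (data_len > 0)
        memcpy(buf + name_len, data, data_len);
    return name_len + data_len;
}

/* Returns a pointer to the data that follows the name. */
const unsigned char *chunk_data(const unsigned char *buf)
{
    return buf + strlen((const char *)buf) + 1;
}
```

One consequence worth noting: the data after the name is no longer aligned, so multi-byte fields would have to be read with memcpy() or byte by byte on architectures that fault on unaligned access.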

Taking this idea still further, maybe we could use hashing for the names. I have never needed to use hashing myself, so I don't know how
safe it is...

> > Maybe we need to come back and consider arrays or structures of linuxplc
> > points?
>
> Arrays - where they are actual arrays from all points of view, ie all
> elements have exactly the same properties - would definitely be useful
> where they are needed. They probably won't be very hard to implement, but
> they don't really solve the above problem.
>

They would solve the problem of the iec M memory. We could have a single linuxplc named array typed point, that would have all the M
memory. Remember that the iec M memory (and the others too) are just an array of 16(?) bit integers that can also be accessed as bits. With
arrays we can define the whole memory with only one linuxplc point.
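
For illustration, this is roughly what bit access into such an array could look like, assuming the M memory really is an array of 16-bit words (the width is uncertain above) and that Mx.y means bit y of word x. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reads bit `bit` of word `word`, i.e. M<word>.<bit>. */
int m_get_bit(const uint16_t *m, unsigned word, unsigned bit)
{
    assert(bit < 16);
    return (m[word] >> bit) & 1u;
}

/* Sets or clears M<word>.<bit>. */
void m_set_bit(uint16_t *m, unsigned word, unsigned bit, int value)
{
    assert(bit < 16);
    if (value)
        m[word] |= (uint16_t)(1u << bit);
    else
        m[word] &= (uint16_t)~(1u << bit);
}
```

The whole array would then be a single named point, with the word/bit addressing done inside the iec module.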

Granted, this means that only the iec module would have write access to this memory. That is the problem with arrays, and it probably needs
discussing.

Mario.

----------------------------------------------------------------------------
Mario J. R. de Sousa
[email protected]
----------------------------------------------------------------------------


#### Phil Long

> We can probably optimize it quite a bit, but I think the most difficult
> will be squashing the name of the point itself (currently using 32 bytes
> for a maximum of 31 char length name)...

> Taking this idea still further, maybe we could use hashing for the
> names. I have never needed to use hashing myself, so I don't know how
> safe it is...

Hashing is pretty safe, since conflicts, which are not at all unusual, can be resolved with a lookup into a linked list whose head lives in the hash table and is put there when a conflict is detected. In the last hashing application on which I worked (also the first), I saw thirty or so hashing 'collisions' in a hashtable using an eleven-bit hashcode and 16 000 entries. Most such collisions were only between two entries, however, although I did have a few with more, and the largest had seven.

Personally, with only 64 names, I wouldn't bother with hashing. Wouldn't it be possible to create an impromptu 'database' that uses one byte for a key (allowing 256 names) and 32 (or a variable number of) bytes of name string? In the data structure itself, only _one_ byte would be used for the name, rather than 32. Each time a name is needed, the one-byte 'key' would be used. Each name would appear in string form in only one place: the 'database.'

Thx,
Phil Long
Heidelberg Web Systems
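
The one-byte-key 'database' suggested above could be sketched as a small string table. Everything here (struct, function names, fixed 32-byte slots) is an assumption for illustration; a linear search is used because the table holds at most 256 names and lookups by name are only expected at startup.

```c
#include <assert.h>
#include <string.h>

#define NAME_MAX_LEN 32      /* bytes per slot, including the '\0' */
#define NAME_SLOTS   256     /* what a one-byte key can address */

struct name_table {
    unsigned count;
    char names[NAME_SLOTS][NAME_MAX_LEN];
};

/* Returns the key (0..255) for name, adding it if it is new.
 * Returns -1 if the name is too long or the table is full. */
int name_intern(struct name_table *t, const char *name)
{
    assert(t != NULL && name != NULL);
    if (strlen(name) >= NAME_MAX_LEN)
        return -1;
    for (unsigned i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return (int)i;
    if (t->count == NAME_SLOTS)
        return -1;
    strcpy(t->names[t->count], name);
    return (int)t->count++;
}

/* Returns the string for a key, or NULL if the key is not in use. */
const char *name_lookup(const struct name_table *t, unsigned char key)
{
    assert(t != NULL);
    return key < t->count ? t->names[key] : NULL;
}
```

Each record then stores only the one-byte key, and the string itself lives in exactly one place.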


#### Mario de Sousa

Mario de Sousa wrote:
>
> Jiri Baum wrote:
> >
> > > > Mario de Sousa:
> > > > > Take the iec compiler for example. We can take the route of (a)
> > > > > having every Mx.y 'variable' be a linuxplc point, which means they
> > > > > all have to be configured, or (b) we can choose to assume that this
> > > > > memory is private to the iec logic module, and only explicitly
> > > > > configured memory locations would be mapped onto linuxplc points.
> >
> > Jiri Baum:
> > > > I'm for option (a), on the grounds that even private variables should
> > > > be visible to the debugging/tracing/etc tools.
> >
> > (...)
> >
> > > > I'm not sure to what extent this actually matters to the smm/gmm.
> > ...
> > > Yes, it can handle both situations, but I think it needs to be optimised
> > > for either one. Currently each named linuxplc point takes up 48+36 bytes
> > > of configuration memory, and 1 to 32 _bits_ of user memory. This ratio
> > > is much too big if we are going to have every scratch variable as a named
> > > linuxplc point.
> >
> > Actually, this ratio is probably much too big in any case... I haven't been
> > looking at the confmap in much detail, I think you still know it better
> > than I do; is there any easy way to reduce this?
> > (...)
>
> We can probably optimize it quite a bit, but I think the most difficult
> will be squashing the name of the point itself (currently using 32 bytes
> for a maximum of 31 char length name).
> (... big snip)
>
> Taking this idea still further, maybe we could use hashing for the
> names. I have never needed to use hashing myself, so I don't know how
> safe it is...
>

I have been skimming through some perfect hashing schemes, and it seems to me that the hashing table ends up quite large, so hashing is probably not the way to go. It also seems that it would be difficult to produce perfect hashes of names of linuxPLC points that will probably be very similar to each other (equal lengths and differing in only one character).

Summing it up:
- if we are very aggressive in optimizing the space required to store the config of a linuxPLC point (and this entails changing the internal
architecture), we can go down to 5 bytes + length of linuxPLC point name;
- if we maintain the current architecture we can go down to 12 bytes + length of linuxPLC point name.

But remember that the plc_pt_by_name(...) is supposed to be called on startup, where timing is not important. This means that if a LinuxPLC
point is going to be used, at least one module will have to create a plc_pt_t structure. This structure currently takes up an additional 12
bytes.

> > > Maybe we need to come back and consider arrays or structures of linuxplc
> > > points?
> >
> > Arrays - where they are actual arrays from all points of view, ie all
> > elements have exactly the same properties - would definitely be useful
> > where they are needed. They probably won't be very hard to implement, but
> > they don't really solve the above problem.
>
> They would solve the problem of the iec M memory. We could have a
> single linuxplc named array typed point, that would have all the M
> memory. (...)

If we are to stick to option a) (please see above), then I feel we will need arrays as a point will always end up taking quite a bit of memory.

Here is my proposal:

1) we optimize the memory usage but still keep the current internal architecture of the cmm.
2) we support arrays. This needs further discussion on whether the current API will be able to support it (it will probably need extending), and
what semantics will be supported.
3) we change the plc_pt_t from a struct to a pointer to a struct. The pointer will point to a memory location inside the cmm. The cmm stores
the configuration of a LinuxPLC point in a format that can be used directly by the gmm, instead of the current method of storing
information to build the plc_pt_t struct. This will allow supporting online changes that somehow change the linuxPLC point configurations. We
only need to update the memory in the cmm, and all modules see the change simultaneously. This will require read/write locks on the cmm.
Unfortunately this change will require changing the code that checks the plc_pt_t.valid structure member, but all the rest should remain the same. This change will also allow saving some memory, as the plc_pt_t struct is reduced to a pointer.
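
To make item 3 a little more concrete, here is a very rough sketch of a point handle that is just a pointer to a configuration record. The record layout, the names pt_conf, pt_handle, pt_is_valid and pt_get, and the word-based addressing are all hypothetical, and the read/write locking mentioned above is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record as it might be stored inside the cmm. */
struct pt_conf {
    uint16_t word_ofs;   /* index of the 32-bit word in the map */
    uint8_t  bit_ofs;    /* first bit within that word */
    uint8_t  length;     /* 1 to 32 bits */
};

/* The handle is only a pointer; no per-module copy of the config. */
typedef const struct pt_conf *pt_handle;

/* Replaces a check of a .valid member: a null handle is invalid. */
int pt_is_valid(pt_handle p)
{
    return p != NULL;
}

/* Reads the point through the handle, so whatever configuration is
 * currently stored in the record is used on every access. */
uint32_t pt_get(pt_handle p, const uint32_t *words)
{
    assert(p != NULL && p->length >= 1 && p->bit_ofs + p->length <= 32);
    uint32_t mask = p->length == 32 ? 0xFFFFFFFFu
                                    : ((1u << p->length) - 1u);
    return (words[p->word_ofs] >> p->bit_ofs) & mask;
}
```

Since every access goes through the shared record, changing the record changes what all holders of the handle see, which is the online-change property the proposal is after.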

What do you guys think of this?

Mario.

----------------------------------------------------------------------------
Mario J. R. de Sousa
[email protected]
----------------------------------------------------------------------------


#### Rick Price

On the topic of hashes, I used to have some code that would vary the hash value by at least one bit for every bit that changed in the input.

I'm not completely sure where it is, but I should be able to dig it up if required.


#### Johan Bengtsson

Use a sum of the characters as a hashing index. If that would not be good enough, use a simplified CRC or something similar.

An idea:

    char hashIndex(const char *s)
    {
        short index=0;
        while (*s)
        {
            index<<=3;
            index^=index>>8; //using assembler could this be
                             //optimised further... (x86)
            index^=*s++;
        }
        return (char)index;
    }

/Johan Bengtsson

----------------------------------------
P&L, Innovation in training
Box 252, S-281 23 Hässleholm SWEDEN
Tel: +46 451 49 460, Fax: +46 451 89 833
E-mail: [email protected]
Internet: http://www.pol.se/
----------------------------------------
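
Both suggestions in the message above can be written with unsigned types, which avoids depending on how a particular compiler treats shifts of negative values or whether plain char is signed. This is a sketch of the same idea, not a tested replacement; the names hash_sum and hash_fold are made up here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of the characters, reduced to one byte. */
uint8_t hash_sum(const char *s)
{
    assert(s != NULL);
    unsigned sum = 0;
    while (*s)
        sum += (unsigned char)*s++;
    return (uint8_t)sum;
}

/* Shift-and-fold loop from the message, on an unsigned 16-bit value. */
uint8_t hash_fold(const char *s)
{
    assert(s != NULL);
    uint16_t index = 0;
    while (*s) {
        index = (uint16_t)(index << 3);      /* make room for new bits */
        index ^= (uint16_t)(index >> 8);     /* fold high byte into low */
        index ^= (unsigned char)*s++;        /* mix in next character */
    }
    return (uint8_t)index;
}
```

The plain sum gives the same value for any reordering of the same characters ("ab" and "ba" both hash to 195), while the shift-and-fold version separates them (105 and 114), which matters for point names that differ only in character order or position.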


#### Amy Critchley

As for more info on the topic of hashing ...

Dr. Dobb's (http://www.ddj.com/) has an article in the September 1997 issue with code listings and evaluations. For those of you who don't have the CD (I believe that article is one of the non-free ones), you can go directly to the web site of the author of the article for his code listings, hashing FAQ, etc. Bob Jenkins is the author, and the web site is http://burtleburtle.net/bob/.

Hope this is useful!

--------------------------------------
Amy Critchley - Software Engineer
E-mail me at mailto:[email protected]
--------------------------------------