Stone - In-memory storage for hierarchical tag/value data structures
    use Stone;

    my $stone = Stone->new(
        Jim => {
            First_name => 'James',
            Last_name  => 'Hill',
            Age        => 34,
            Address    => {
                Street => ['The Manse', '19 Chestnut Ln'],
                City   => 'Garden City',
                State  => 'NY',
                Zip    => 11291
            }
        },
        Sally => {
            First_name => 'Sarah',
            Last_name  => 'James',
            Age        => 30,
            Address    => {
                Street => 'Hickory Street',
                City   => 'Katonah',
                State  => 'NY',
                Zip    => 10578
            }
        }
    );

    @tags    = $stone->tags;          # yields ('Jim','Sally')
    $address = $stone->Jim->Address;  # gets the address subtree
    @street  = $address->Street;      # yields ('The Manse','19 Chestnut Ln')

    $address = $stone->get('Jim')->get('Address'); # same as $stone->Jim->Address
    $address = $stone->get('Jim.Address');         # another way to express same thing

    # first Street tag in Jim's address
    $address = $stone->get('Jim.Address.Street[0]');
    # second Street tag in Jim's address
    $address = $stone->get('Jim.Address.Street[1]');
    # last Street tag in Jim's address
    $address = $stone->get('Jim.Address.Street[#]');

    # insert a tag/value pair
    $stone->insert(Martha => { First_name => 'Martha', Last_name => 'Steward'} );

    # find the first Address
    $stone->search('Address');

    # change an existing subtree
    $martha = $stone->Martha;
    $martha->replace(Last_name => 'Stewart');  # replace a value

    # iterate over the tree with a cursor
    # ($key is a full tag path such as Sally[0].Address[0].City[0])
    $cursor = $stone->cursor;
    while (my ($key,$value) = $cursor->each) {
        print "$value: Go Bluejays!\n" if $key =~ /City/ and $value eq 'Katonah';
    }

    # various format conversions
    print $stone->asTable;
    print $stone->asString;
    print $stone->asHTML;
    print $stone->asXML('Person');
A Stone consists of a series of tag/value pairs. Any given tag may be single-valued or multivalued. A value can be another Stone, allowing nested components. A big Stone can be made up of a lot of little stones (pebbles?). You can obtain a Stone from a Boulder::Stream or Boulder::Store persistent database. Alternatively you can build your own Stones bit by bit.
Stones can be exported into string, XML and HTML representations. In addition, they are flattened into a linearized representation when reading from or writing to a Boulder::Stream or one of its descendants.
Stone was designed for subclassing. You should be able to create subclasses which create or require particular tags and data formats. Currently only Stone::GB_Sequence subclasses Stone.
Stones are either created by calling the new() method, or by reading them from a Boulder::Stream or persistent database.
$stone = Stone->new()
This is the main constructor for the Stone class. It can be called without any parameters, in which case it creates an empty Stone object (no tags or values), or it may be passed an associative array in order to initialize it with a set of tags. A tag's value may be a scalar, an anonymous array reference (constructed using [] brackets), or a hash reference (constructed using {} brackets). In the first case, the tag will be single-valued. In the second, the tag will be multivalued. In the third case, a subsidiary Stone will be generated automatically and placed into the tree at the specified location.
Examples:
    $myStone = new Stone;
    $myStone = new Stone(Name=>'Fred',Age=>30);
    $myStone = new Stone(Name=>'Fred',
                         Friend=>['Jill','John','Jerry']);
    $myStone = new Stone(Name=>'Fred',
                         Friend=>['Jill',
                                  'John',
                                  'Gerald'
                                  ],
                         Attributes => { Hair => 'blonde',
                                         Eyes => 'blue' }
                         );
In the last example, a Stone with the following structure is created:
    Name        Fred
    Friend      Jill
    Friend      John
    Friend      Gerald
    Attributes  Eyes    blue
                Hair    blonde
Note that the value corresponding to the tag "Attributes" is itself a Stone with two tags, "Eyes" and "Hair".
The XML representation (which could be created with asXML()) looks like this:
    <?xml version="1.0" standalone="yes"?>
    <Stone>
       <Attributes>
          <Eyes>blue</Eyes>
          <Hair>blonde</Hair>
       </Attributes>
       <Friend>Jill</Friend>
       <Friend>John</Friend>
       <Friend>Gerald</Friend>
       <Name>Fred</Name>
    </Stone>
More information on Stone initialization is given in the description of the insert() method.
Once a Stone object is created or retrieved, you can manipulate it with the following methods.
$stone->insert(%hash)
$stone->insert(\%hash)
This is the main method for adding tags to a Stone. It expects an associative array, or a reference to one, as its argument. The contents of the associative array will be inserted into the Stone. If a particular tag is already present in the Stone, the new value will be appended to the list of values for that tag. Several types of values are legal:
•   A scalar value

    The value will be inserted into the "Stone".

        $stone->insert(name=>'Fred',
                       age=>30,
                       sex=>'M');
        $stone->dump;

        name[0]=Fred
        age[0]=30
        sex[0]=M
•   An ARRAY reference

    A multi-valued tag will be created:

        $stone->insert(name=>'Fred',
                       children=>['Tom','Mary','Angelique']);
        $stone->dump;

        name[0]=Fred
        children[0]=Tom
        children[1]=Mary
        children[2]=Angelique
•   A HASH reference

    A subsidiary "Stone" object will be created and inserted into the object as a nested structure.

        $stone->insert(name=>'Fred',
                       wife=>{name=>'Agnes',age=>40});
        $stone->dump;

        name[0]=Fred
        wife[0].name[0]=Agnes
        wife[0].age[0]=40
•   A "Stone" object or subclass

    The "Stone" object will be inserted into the object as a nested structure.

        $wife = new Stone(name=>'agnes',
                          age=>40);
        $husband = new Stone;
        $husband->insert(name=>'fred',
                         wife=>$wife);
        $husband->dump;

        name[0]=fred
        wife[0].name[0]=agnes
        wife[0].age[0]=40
$stone->replace(%hash)
$stone->replace(\%hash)
The replace() method behaves exactly like "insert()" with the exception that if the indicated key already exists in the Stone, its value will be replaced. Use replace() when you want to enforce a single-valued tag/value relationship.
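The difference between the two methods can be sketched as follows. This is a minimal illustration that assumes the Stone module is installed; the tag name Nickname and its values are invented for the example.

```perl
use strict;
use warnings;
use Stone;

my $person = Stone->new(Name => 'Fred');

# insert() appends, so the tag becomes multivalued.
$person->insert(Nickname => 'Freddy');
$person->insert(Nickname => 'Red');
my @after_insert = $person->get('Nickname');    # list context: all values

# replace() swaps out whatever values the tag already had.
$person->replace(Nickname => 'Fritz');
my @after_replace = $person->get('Nickname');

print scalar(@after_insert),  " value(s) after insert()\n";
print scalar(@after_replace), " value(s) after replace()\n";
```

Going by the descriptions above, the first count should be larger than the second, because only insert() accumulates values.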
$stone->insert_list($key,@list)
$stone->insert_hash($key,%hash)
$stone->replace_list($key,@list)
$stone->replace_hash($key,%hash)
These are primitives used by the "insert()" and "replace()" methods. Override them if you need to modify the default behavior.
$stone->delete($tag)
This removes the indicated tag from the Stone.
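A short sketch of delete() in use, assuming the Stone module is installed. The sample data is made up, and tags() is used only to show which tags are left afterwards.

```perl
use strict;
use warnings;
use Stone;

my $person = Stone->new(Name => 'Fred', Age => 30, Friend => ['Jill', 'John']);

# delete() removes the tag together with all of its values.
$person->delete('Friend');

# tags() makes no promise about order, so sort for stable output.
my @remaining = sort $person->tags;
print "Remaining tags: @remaining\n";
```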
@values = $stone->get($tag [,$index])
This returns the value at the indicated tag and optional index. What you get depends on whether it is called in a scalar or list context. In a list context, you will receive all the values for that tag. This may be a list of scalar values or (for a nested record) a list of Stone objects. If called in a scalar context, you will receive either the first or the last member of the list of values assigned to the tag. Which one you receive depends on the value of the package variable $Stone::Fetchlast. If it is undefined, you will receive the first member of the list. If it is nonzero, you will receive the last member.
You may provide an optional index in order to force get() to return a particular member of the list. Provide a 0 to return the first member of the list, or ’#’ to obtain the last member.
If the tag contains a period (.), get() will call index() on your behalf (see below).
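The context rules above can be illustrated with a small sketch. It assumes the Stone module is installed and uses invented sample data. Wrapping the $Stone::Fetchlast change in a block with local is a choice made for this example, so that the setting does not leak into other code.

```perl
use strict;
use warnings;
use Stone;

my $person = Stone->new(Name => 'Fred', Friend => ['Jill', 'John', 'Gerald']);

my @all   = $person->get('Friend');      # list context: every value
my $first = $person->get('Friend');      # scalar context: first value by default
my $zero  = $person->get('Friend', 0);   # explicit index 0: first value

my $last;
{
    # Temporarily ask scalar-context lookups for the last value instead.
    local $Stone::Fetchlast = 1;
    $last = $person->get('Friend');
}

print "all=@all first=$first zero=$zero last=$last\n";
```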
If the tag begins with an uppercase letter, then you can use the autogenerated method to access it:
    $stone->Tag_name([$index])

This is exactly equivalent to:

    $stone->get('Tag_name' [,$index])
@values = $stone->search($tag)
Searches for the first occurrence of the tag, traversing the tree in a breadth-first manner, and returns it. This allows you to retrieve the value of a tag in a deeply nested structure without worrying about all the intermediate nodes. For example:
    $myStone = new Stone(Name=>'Fred',
                         Friend=>['Jill',
                                  'John',
                                  'Gerald'
                                  ],
                         Attributes => { Hair => 'blonde',
                                         Eyes => 'blue' }
                         );
    $hair_colour = $myStone->search('Hair');
The disadvantage of this is that if there is a tag named "Hair" higher in the hierarchy, this tag will be retrieved rather than the lower one. In an array context this method returns the complete list of values from the matching tag. In a scalar context, it returns either the first or the last value of multivalued tags depending as usual on the value of $Stone::Fetchlast.
$Stone::Fetchlast is also consulted during the traversal. If $Stone::Fetchlast is set to a true value, multivalued intermediate tags will be searched from the last to the first rather than the first to the last.
The Stone object has an AUTOLOAD method that invokes get() when you call a method that is not predefined. This allows a very convenient type of shortcut:
    $name      = $stone->Name;
    @friends   = $stone->Friend;
    $eye_color = $stone->Attributes->Eyes;
In the first example, we retrieve the value of the top-level tag Name. In the second example, we retrieve all the values of the Friend tag. In the third example, we retrieve the Attributes stone first, then its Eyes value.
NOTE: By convention, methods are only autogenerated for tags that begin with capital letters. This is necessary to avoid conflict with hard-coded methods, all of which are lower case.
@values = $stone->index($indexstr)
You can access the contents of even deeply-nested Stone objects with the "index" method. You provide a tag path, and receive a value or list of values back.
Tag paths look like this:
tag1[index1].tag2[index2].tag3[index3]
Numbers in square brackets indicate which member of a multivalued tag you’re interested in getting. You can leave the square brackets out in order to return just the first or the last tag of that name, in a scalar context (depending on the setting of $Stone::Fetchlast). In an array context, leaving the square brackets out will return all multivalued members for each tag along the path.
You will get a scalar value in a scalar context and an array value in an array context, following the same rules as get(). You can provide an index of '#' in order to get the last member of a list, or '?' to obtain a randomly chosen member of the list (this uses the rand() call, so be sure to call srand() at the beginning of your program in order to get different sequences of pseudorandom numbers). If there is no tag by that name, you will receive undef or an empty list. If the tag points to a subrecord, you will receive a Stone object.
Examples:
    # Here's what the data structure looks like.
    $s->insert(person=>{name=>'Fred',
                        age=>30,
                        pets=>['Fido','Rex','Lassie'],
                        children=>['Tom','Mary']},
               person=>{name=>'Harry',
                        age=>23,
                        pets=>['Rover','Spot']});

    # Return all of Fred's children
    @children = $s->index('person[0].children');

    # Return Harry's last pet
    $pet = $s->index('person[1].pets[#]');

    # Return first person's first child
    $child = $s->index('person.children');

    # Return children of all persons
    @children = $s->index('person.children');

    # Return last person's last pet
    $Stone::Fetchlast++;
    $pet = $s->index('person.pets');

    # Return any pet from any person
    $pet = $s->index('person[?].pets[?]');
Note that index() may return a Stone object if the tag path points to a subrecord.
$array = $stone->at($tag)
This returns an ARRAY REFERENCE for the tag. It is useful to prevent automatic dereferencing. Use with care. It is equivalent to:
    $stone->{'tag'}
at() will always return an array reference. Single-valued tags will return a reference to an array of size 1.
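A brief sketch of at(), assuming the Stone module is installed and using made-up data. It shows that both single-valued and multivalued tags come back as array references.

```perl
use strict;
use warnings;
use Stone;

my $person = Stone->new(Name => 'Fred', Friend => ['Jill', 'John']);

my $friends = $person->at('Friend');   # reference to an array of two values
my $name    = $person->at('Name');     # reference to an array of one value

printf "%s has %d friend(s); the first is %s\n",
    $name->[0], scalar(@$friends), $friends->[0];
```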
@tags = $stone->tags()
Return all the tags in the Stone. You can then use this list with get() to retrieve values or recursively traverse the stone.
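The recursive traversal mentioned above might look like the following sketch. It assumes the Stone module is installed. The helper outline() is hypothetical and not part of Stone; it relies only on tags(), on get() in list context, and on nested records being Stone objects.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);
use Stone;

# Hypothetical helper: returns one line per tag/value pair,
# indented by nesting depth.
sub outline {
    my ($stone, $depth) = @_;
    $depth ||= 0;
    my @lines;
    for my $tag (sort $stone->tags) {
        for my $value ($stone->get($tag)) {    # list context: all values
            if (blessed($value) && $value->isa('Stone')) {
                # Nested record: emit the tag, then descend into it.
                push @lines, ('  ' x $depth) . $tag;
                push @lines, outline($value, $depth + 1);
            }
            else {
                push @lines, ('  ' x $depth) . "$tag = $value";
            }
        }
    }
    return @lines;
}

my $person = Stone->new(
    Name       => 'Fred',
    Friend     => ['Jill', 'John'],
    Attributes => { Hair => 'blonde', Eyes => 'blue' },
);
my @lines = outline($person);
print "$_\n" for @lines;
```

Sorting the tags is a choice made here to get stable output; tags() itself makes no promise about order.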
$string = $stone->asTable()
Return the data structure as a tab-delimited table suitable for printing.
$string = $stone->asXML([$tagname])
Return the data structure in XML format. The entire data structure will be placed inside a top-level tag called <Stone>. If you wish to change this top-level tag, pass it as an argument to asXML().
An example follows:
    print $stone->asXML('Address_list');
    # yields:
    <?xml version="1.0" standalone="yes"?>

    <Address_list>
       <Sally>
          <Address>
             <Zip>10578</Zip>
             <City>Katonah</City>
             <Street>Hickory Street</Street>
             <State>NY</State>
          </Address>
          <Last_name>James</Last_name>
          <Age>30</Age>
          <First_name>Sarah</First_name>
       </Sally>
       <Jim>
          <Address>
             <Zip>11291</Zip>
             <City>Garden City</City>
             <Street>The Manse</Street>
             <Street>19 Chestnut Ln</Street>
             <State>NY</State>
          </Address>
          <Last_name>Hill</Last_name>
          <Age>34</Age>
          <First_name>James</First_name>
       </Jim>
    </Address_list>
$hash = $stone->attributes([$att_name [,$att_value]])
attributes() returns the "attributes" of a tag. Attributes are a series of unique tag/value pairs which are associated with a tag, but are not contained within it. Attributes can only be expressed in the XML representation of a Stone:
    <Sally id="sally_tate" version="2.0">
      <Address type="postal">
         <Zip>10578</Zip>
         <City>Katonah</City>
         <Street>Hickory Street</Street>
         <State>NY</State>
      </Address>
    </Sally>
Called with no arguments, attributes() returns the current attributes as a hash ref:
    my $att = $stone->Address->attributes;
    my $type = $att->{type};
Called with a single argument, attributes() returns the value of the named attribute, or undef if not defined:
    my $type = $stone->Address->attributes('type');
Called with two arguments, attributes() sets the named attribute:
    my $type = $stone->Address->attributes(type => 'Rural Free Delivery');
You may also change all attributes in one fell swoop by passing a hash reference as the single argument:
    $stone->attributes({id=>'Sally Mae',version=>'2.1'});
$string = $stone->toString()
toString() returns a simple version of the Stone that shows just the topmost tags and the number of each type of tag. For example:
    print $stone->Jim->Address;
    # yields => Zip(1),City(1),Street(2),State(1)
This method is used internally for string interpolation. If you try to print or otherwise manipulate a Stone object as a string, you will obtain this type of string as a result.
$string = $stone->asHTML([\&callback])
Return the data structure as a nicely-formatted HTML 3.2 table, suitable for display in a Web browser. You may pass this method a callback routine which will be called for every tag/value pair in the object. It will be passed a two-item list containing the current tag and value. It can make any modifications it likes and return the modified tag and value as a return result. You can use this to modify tags or values on the fly, for example to turn them into HTML links.
For example, this code fragment will turn all tags named "Sequence" blue:
    my $callback = sub {
        my ($tag,$value) = @_;
        # leave every tag except "Sequence" untouched
        return ($tag,$value) unless $tag eq 'Sequence';
        return ( qq(<FONT COLOR="blue">$tag</FONT>),$value );
    };
    print $stone->asHTML($callback);
Stone::dump()
This is a debugging tool. It iterates through the Stone object and prints out all the tags and values.
Example:
    $s->dump;

    person[0].children[0]=Tom
    person[0].children[1]=Mary
    person[0].name[0]=Fred
    person[0].pets[0]=Fido
    person[0].pets[1]=Rex
    person[0].pets[2]=Lassie
    person[0].age[0]=30
    person[1].name[0]=Harry
    person[1].pets[0]=Rover
    person[1].pets[1]=Spot
    person[1].age[0]=23
$cursor = $stone->cursor()
Retrieves an iterator over the object. You can call this several times in order to return independent iterators. The following brief example is described in more detail in Stone::Cursor.
    my $curs = $stone->cursor;
    while (my($tag,$value) = $curs->next_pair) {
        print "$tag => $value\n";
    }
    # yields:
    Sally[0].Address[0].Zip[0] => 10578
    Sally[0].Address[0].City[0] => Katonah
    Sally[0].Address[0].Street[0] => Hickory Street
    Sally[0].Address[0].State[0] => NY
    Sally[0].Last_name[0] => James
    Sally[0].Age[0] => 30
    Sally[0].First_name[0] => Sarah
    Jim[0].Address[0].Zip[0] => 11291
    Jim[0].Address[0].City[0] => Garden City
    Jim[0].Address[0].Street[0] => The Manse
    Jim[0].Address[0].Street[1] => 19 Chestnut Ln
    Jim[0].Address[0].State[0] => NY
    Jim[0].Last_name[0] => Hill
    Jim[0].Age[0] => 34
    Jim[0].First_name[0] => James
Lincoln D. Stein <lstein AT cshl DOT org>.
Copyright 1997-1999, Cold Spring Harbor Laboratory, Cold Spring Harbor NY. This module can be used and distributed on the same terms as Perl itself.
Boulder::Blast, Boulder::Genbank, Boulder::Medline, Boulder::Unigene, Boulder::Omim, Boulder::SwissProt