<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
5
>Dealing with nested structures in tables</TITLE
8
CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
10
TITLE="PyTables User's Guide"
11
HREF="index.html"><LINK
14
HREF="c514.html"><LINK
16
TITLE="Using enumerated types"
17
HREF="x1185.html"><LINK
19
TITLE="Other examples in PyTables distribution"
20
HREF="x1376.html"></HEAD
31
SUMMARY="Header navigation table"
43
> User's Guide: Hierarchical datasets in Python - Release 1.3.2</TH
59
>Chapter 3. Tutorials</TD
80
>3.7. Dealing with nested structures in tables</A
83
>PyTables supports the handling of nested structures (or
84
nested datatypes, as you prefer) in table objects, allowing
85
you to define arbitrarily nested columns.
88
>An example will clarify what this means. Suppose that
some pieces of information in your table are more closely
related to each other than to the rest. You may want to
tie them together, both to give the table a better
structure and to retrieve and handle these groups more
easily.
96
>You can create such nested substructures by just nesting
98
CLASS="computeroutput"
101
example (okay, it's a bit silly, but it will serve for
demonstration purposes):
106
> class Info(IsDescription):
     """A sub-structure of Test"""
     _v_pos = 2   # The position in the whole structure
     name = StringCol(10)
     value = Float64Col(pos=0)

 colors = Enum(['red', 'green', 'blue'])   # An enumerated type

 class NestedDescr(IsDescription):
     """A description that has several nested columns"""
     color = EnumCol(colors, 'red', dtype='UInt32', indexed=1)   # indexed column
     info1 = Info()   # nested column built from the Info class above
     class info2(IsDescription):
         _v_pos = 1
         name = StringCol(10)
         value = Float64Col(pos=0)
         class info3(IsDescription):
             x = FloatCol(dflt=1)
             y = UInt8Col(dflt=1)
128
>The root class is <SAMP
129
CLASS="computeroutput"
133
CLASS="computeroutput"
136
CLASS="computeroutput"
145
> of it. Note how <SAMP
146
CLASS="computeroutput"
149
actually an instance of the class <SAMP
150
CLASS="computeroutput"
153
defined prior to <SAMP
154
CLASS="computeroutput"
157
third substructure, namely <SAMP
158
CLASS="computeroutput"
161
from the substructure <SAMP
162
CLASS="computeroutput"
165
define positions of substructures in the containing object
166
by declaring the special class attribute
168
CLASS="computeroutput"
177
NAME="subsection3.7.1"
178
>3.7.1. Nested table creation</A
181
>Now that we have defined our nested structure, let's
188
> table, that is a table with
189
columns that contain other subcolumns.
193
> >>> from tables import *
194
>>> fileh = openFile("nested-tut.h5", "w")
195
>>> table = fileh.createTable(fileh.root, 'table', NestedDescr)
199
>Done! Now we have to feed the table with some
values. The question is how to reference
the nested fields. That's easy: just use a
203
CLASS="computeroutput"
205
> character to separate names in different
206
nested levels. Look at this:
210
> >>> row = table.row
 >>> for i in range(10):
 ...     row['color'] = colors[['red', 'green', 'blue'][i%3]]
 ...     row['info1/name'] = "name1-%s" % i
 ...     row['info2/name'] = "name2-%s" % i
 ...     row['info2/info3/y'] = i
 ...     # All the rest will be filled with defaults
 ...     row.append()
 ...
 >>> table.flush()
 >>> table.nrows
224
>You see? To fill the fields located in the
substructures, we just need to specify their full paths
in the table hierarchy.
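The path convention can be illustrated without PyTables at all. The following is a minimal plain-Python sketch, not the library's implementation: the helper names set_by_path and new_row are hypothetical, and the default values (including the empty string for names) are assumptions loosely modelled on the description used in this tutorial.

```python
# Plain-Python sketch of '/'-separated addressing of nested fields.
# Nothing here uses PyTables; set_by_path and new_row are hypothetical helpers.

def set_by_path(record, path, value):
    """Set a value in a nested dict, using '/' to separate nesting levels."""
    *parents, leaf = path.split("/")
    node = record
    for name in parents:
        node = node[name]          # descend one nesting level per component
    if leaf not in node:
        raise KeyError(path)       # refuse to create fields that do not exist
    node[leaf] = value


def new_row():
    """Return a record pre-filled with assumed default values."""
    return {
        "info1": {"name": "", "value": 0.0},
        "info2": {"name": "", "value": 0.0, "info3": {"x": 1.0, "y": 1}},
        "color": 0,
    }


row = new_row()
set_by_path(row, "info1/name", "name1-5")
set_by_path(row, "info2/info3/y", 5)
print(row["info2"]["info3"])
```

Fields that are never assigned, such as x here, simply keep their defaults, which mirrors the comment in the loop above.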
234
NAME="subsection3.7.2"
235
>3.7.2. Reading nested tables: introducing <SPAN
237
>NestedRecArray</SPAN
241
>Now, what happens if we want to read the table? Which
242
data container will be used to keep the data? Well, it's
247
> >>> nra = table[::4]
248
>>> print nra
250
(((1.0, 0), 'name2-0', 0.0), ('name1-0', 0.0), 0L),
251
(((1.0, 4), 'name2-4', 0.0), ('name1-4', 0.0), 1L),
252
(((1.0, 8), 'name2-8', 0.0), ('name1-8', 0.0), 2L)
257
>We have read one row out of every four in the table, giving a
result of three rows. What about the container? Well, we
can see that it is a new mysterious object known as
261
CLASS="computeroutput"
262
>NestedRecArray</SAMP
263
>. If we ask for more info on
268
> >>> type(nra)
269
<class 'tables.nestedrecords.NestedRecArray'>
272
>we see that it is an instance of the class
274
CLASS="computeroutput"
275
>NestedRecArray</SAMP
276
> that lives in the module
278
CLASS="computeroutput"
281
CLASS="computeroutput"
285
CLASS="computeroutput"
286
>NestedRecArray</SAMP
288
subclass of the <SAMP
289
CLASS="computeroutput"
293
CLASS="computeroutput"
296
CLASS="computeroutput"
299
package. You can see more info about
301
CLASS="computeroutput"
302
>NestedRecArray</SAMP
304
HREF="a6736.html#NestedRecArrayClassDescr"
309
>You can make use of the above object in many different
310
ways. For example, you can use it to append new data to
311
the existing table object:
315
> >>> table.append(nra)
316
>>> table.nrows
321
>Or, to create new tables:
325
> >>> table2 = fileh.createTable(fileh.root, 'table2', nra)
326
>>> table2[:]
328
[(((1.0, 0), 'name2-0', 0.0), ('name1-0', 0.0), 0L),
329
(((1.0, 4), 'name2-4', 0.0), ('name1-4', 0.0), 1L),
330
(((1.0, 8), 'name2-8', 0.0), ('name1-8', 0.0), 2L)],
331
descr=[('info2', [('info3', [('x', '1f8'), ('y', '1u1')]), ('name',
332
'1a10'), ('value', '1f8')]), ('info1', [('name', '1a10'), ('value',
333
'1f8')]), ('color', '1u4')], shape=3)
336
>Finally, we can select nested values that fulfill some
341
> >>> names = [ x['info2/name'] for x in table if x['color'] == colors.red ]
342
>>> names
343
['name2-0', 'name2-3', 'name2-6', 'name2-9', 'name2-0']
347
>Note that the row accessor does not provide the natural
348
naming feature, so you have to completely specify the path
349
of your desired columns in order to reach them.
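To see why the selection returns five names, with name2-0 appearing twice, here is a small plain-Python sketch that replays the steps of this section using dictionaries instead of a real table. It is only an illustration under that assumption; make_row is a hypothetical helper and no PyTables call is involved.

```python
# Plain-Python replay of the selection; rows are dicts keyed by full path.
colors = {"red": 0, "green": 1, "blue": 2}

def make_row(i):
    """Build the two fields the query looks at, as filled in the loop above."""
    return {"info2/name": "name2-%s" % i, "color": i % 3}

rows = [make_row(i) for i in range(10)]   # the ten rows written initially
rows += rows[::4]                         # rows 0, 4 and 8 appended again

names = [r["info2/name"] for r in rows if r["color"] == colors["red"]]
print(names)
```

Rows 0, 3, 6 and 9 are red, and the appended copy of row 0 accounts for the repeated last entry.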
357
NAME="subsection3.7.3"
358
>3.7.3. Using Cols accessor</A
361
>We can use the <SAMP
362
CLASS="computeroutput"
364
> attribute object (see
366
HREF="x3528.html#ColsClassDescr"
368
>) of the table so as to
369
quickly access the info located in the interesting
374
> >>> table.cols.info2[1:5]
376
[((1.0, 1), 'name2-1', 0.0),
377
((1.0, 2), 'name2-2', 0.0),
378
((1.0, 3), 'name2-3', 0.0),
379
((1.0, 4), 'name2-4', 0.0)],
380
descr=[('info3', [('x', '1f8'), ('y', '1u1')]), ('name', '1a10'),
386
>Here, we have used the cols accessor to access the
393
> substructure and a slice operation to
get the subset of data we were interested in;
you have probably recognized the natural naming approach
here. We can continue and ask for data in <SPAN
407
> >>> table.cols.info2.info3[1:5]
413
descr=[('x', '1f8'), ('y', '1u1')],
418
>You can also use the <SAMP
419
CLASS="computeroutput"
422
handler for a column:
426
> >>> table.cols._f_col('info2')
427
/table.cols.info2 (Cols), 3 columns
428
info3 (Cols(1,), Description)
429
name (Column(1,), CharType)
430
value (Column(1,), Float64)
433
>Here, you've got another <SAMP
434
CLASS="computeroutput"
443
> was a nested column. If you select
444
a non-nested column, you will get a regular
446
CLASS="computeroutput"
452
> >>> ycol = table.cols._f_col('info2/info3/y')
454
/table.cols.info2.info3.y (Column(1,), UInt8, idx=None)
458
>To sum up, the <SAMP
459
CLASS="computeroutput"
461
> accessor is a very handy
and powerful way to access data in your nested tables. Be
sure to use it, especially when doing interactive work.
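The natural naming idea behind this accessor can be sketched in a few lines of plain Python. The class below is a hypothetical stand-in written for illustration only; it is not how PyTables implements Cols, and the names ColsSketch and by_path are assumptions introduced here.

```python
# Hypothetical stand-in for a natural-naming accessor over nested columns.
class ColsSketch(object):
    def __init__(self, columns):
        self._columns = columns    # dict: name -> list (leaf) or dict (nested)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_") or name not in self._columns:
            raise AttributeError(name)
        child = self._columns[name]
        # Nested groups are wrapped again so that dotted access keeps working.
        return ColsSketch(child) if isinstance(child, dict) else child

    def by_path(self, path):
        """Resolve an 'a/b/c' path, one attribute lookup per component."""
        node = self
        for name in path.split("/"):
            node = getattr(node, name)
        return node


table_columns = {
    "info2": {
        "info3": {"x": [1.0] * 10, "y": list(range(10))},
        "name": ["name2-%s" % i for i in range(10)],
        "value": [0.0] * 10,
    },
    "info1": {"name": ["name1-%s" % i for i in range(10)], "value": [0.0] * 10},
    "color": [i % 3 for i in range(10)],
}

cols = ColsSketch(table_columns)
print(cols.info2.info3.y[1:5])
print(cols.by_path("info2/name")[1:3])
```

Asking for a nested name returns another accessor, while asking for a leaf returns the column data itself, which is the same distinction made above between nested and non-nested columns.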
471
NAME="subsection3.7.4"
472
>3.7.4. Accessing meta-information of nested
476
>Tables have an attribute called <SAMP
477
CLASS="computeroutput"
480
which points to an instance of the
482
CLASS="computeroutput"
485
HREF="x3623.html#DescriptionClassDescr"
488
discover different meta-information about table
492
>Let's see what it looks like:
496
> >>> table.description
500
"x": FloatCol(dflt=1, shape=1, itemsize=8, pos=0, indexed=False),
501
"y": UInt8Col(dflt=1, shape=1, pos=1, indexed=False)},
502
"name": StringCol(length=10, dflt=None, shape=1, pos=1, indexed=False),
503
"value": Float64Col(dflt=0.0, shape=1, pos=2, indexed=False)},
505
"name": StringCol(length=10, dflt=None, shape=1, pos=0, indexed=False),
506
"value": Float64Col(dflt=0.0, shape=1, pos=1, indexed=False)},
507
"color": EnumCol(Enum({'blue': 2, 'green': 1, 'red': 0}), 'red',
508
dtype='UInt32', shape=1, pos=2, indexed=1)}
512
>As you can see, it provides very useful information on
513
both the formats and the structure of the columns in your
517
>This object also provides a natural naming approach to
access subcolumn metadata:
522
> >>> table.description.info1
524
"name": StringCol(length=10, dflt=None, shape=1, pos=0, indexed=False),
525
"value": Float64Col(dflt=0.0, shape=1, pos=1, indexed=False)}
526
>>> table.description.info2.info3
528
"x": FloatCol(dflt=1, shape=1, itemsize=8, pos=0, indexed=False),
529
"y": UInt8Col(dflt=1, shape=1, pos=1, indexed=False)}
533
>Other attributes may also be of interest:
537
> >>> table.description._v_nestedNames
538
[('info2', [('info3', ['x', 'y']), 'name', 'value']), ('info1',
539
['name', 'value']), 'color']
540
>>> table.description.info1._v_nestedNames
546
CLASS="computeroutput"
547
>_v_nestedNames</SAMP
548
> provides the names of the
columns as well as their structure. You can see that
the same attributes exist at the different levels of the
552
CLASS="computeroutput"
554
> object, because the levels are
562
CLASS="computeroutput"
564
> objects themselves.
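The relationship between a nested description and its nested names can be made concrete with a short recursive function. This is a plain-Python sketch that works on the list-of-tuples form shown in this section; the function names nested_names and full_paths are hypothetical and are not part of PyTables.

```python
# Derive name structures from a nested (name, format-or-sublist) description.
def nested_names(descr):
    """Keep only the names, preserving the nesting of the description."""
    names = []
    for name, fmt in descr:
        if isinstance(fmt, list):          # a nested column: recurse into it
            names.append((name, nested_names(fmt)))
        else:                              # a leaf column: keep just its name
            names.append(name)
    return names


def full_paths(descr, prefix=""):
    """Flatten the description into '/'-separated paths of leaf columns."""
    paths = []
    for name, fmt in descr:
        if isinstance(fmt, list):
            paths.extend(full_paths(fmt, prefix + name + "/"))
        else:
            paths.append(prefix + name)
    return paths


descr = [
    ("info2", [("info3", [("x", "1f8"), ("y", "1u1")]),
               ("name", "1a10"), ("value", "1f8")]),
    ("info1", [("name", "1a10"), ("value", "1f8")]),
    ("color", "1u4"),
]

print(nested_names(descr))
print(full_paths(descr))
```

Because the function recurses, applying it to the sublist of a nested column gives the names for that level only, just as the same attribute is available on each nested description.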
567
>There is a special attribute, called
569
CLASS="computeroutput"
570
>_v_nestedDescr</SAMP
571
>, that can be useful to create
573
CLASS="computeroutput"
574
>NestedRecArray</SAMP
575
> objects that imitate the
576
structure of the table (or a subtable!):
580
> >>> from tables import nestedrecords
581
>>> table.description._v_nestedDescr
582
[('info2', [('info3', [('x', '1f8'), ('y', '1u1')]), ('name', '1a10'),
583
('value', '1f8')]), ('info1', [('name', '1a10'), ('value', '1f8')]),
585
>>> nestedrecords.array(None, descr=table.description._v_nestedDescr)
588
descr=[('info2', [('info3', [('x', '1f8'), ('y', '1u1')]), ('name',
589
'1a10'), ('value', '1f8')]), ('info1', [('name', '1a10'), ('value',
590
'1f8')]),('color', '1u4')], shape=0)
591
>>> nestedrecords.array(None, descr=table.description.info2._v_nestedDescr)
594
descr=[('info3', [('x', '1f8'), ('y', '1u1')]), ('name', '1a10'),
595
('value', '1f8')], shape=0)
600
HREF="x3623.html#DescriptionClassDescr"
603
for the complete listing of attributes.
606
>Finally, there is a special iterator of the
608
CLASS="computeroutput"
610
> class, called <SAMP
611
CLASS="computeroutput"
614
that returns the different columns of the
619
> >>> for coldescr in table.description._v_walk():
 ...     print "column-->",coldescr
622
column--> Description([('info2', [('info3', [('x', '1f8'), ('y',
623
'1u1')]), ('name', '1a10'), ('value', '1f8')]), ('info1', [('name',
624
'1a10'), ('value', '1f8')]), ('color', '1u4')])
625
column--> EnumCol(Enum({'blue': 2, 'green': 1, 'red': 0}), 'red',
626
dtype='UInt32', shape=1, pos=2, indexed=1)
627
column--> Description([('info3', [('x', '1f8'), ('y', '1u1')]),
628
('name', '1a10'), ('value', '1f8')])
629
column--> StringCol(length=10, dflt=None, shape=1, pos=1, indexed=False)
630
column--> Float64Col(dflt=0.0, shape=1, pos=2, indexed=False)
631
column--> Description([('name', '1a10'), ('value', '1f8')])
632
column--> StringCol(length=10, dflt=None, shape=1, pos=0, indexed=False)
633
column--> Float64Col(dflt=0.0, shape=1, pos=1, indexed=False)
634
column--> Description([('x', '1f8'), ('y', '1u1')])
635
column--> FloatCol(dflt=1, shape=1, itemsize=8, pos=0, indexed=False)
636
column--> UInt8Col(dflt=1, shape=1, pos=1, indexed=False)
640
>Well, this is the end of this tutorial. As always, do not
641
forget to close your files:
645
> >>> fileh.close()
649
>Finally, you may want to have a look at your resulting
654
> $ ptdump -d nested-tut.h5
656
/table (Table(13L,)) ''
658
[0] (((1.0, 0), 'name2-0', 0.0), ('name1-0', 0.0), 0L)
659
[1] (((1.0, 1), 'name2-1', 0.0), ('name1-1', 0.0), 1L)
660
[2] (((1.0, 2), 'name2-2', 0.0), ('name1-2', 0.0), 2L)
661
[3] (((1.0, 3), 'name2-3', 0.0), ('name1-3', 0.0), 0L)
662
[4] (((1.0, 4), 'name2-4', 0.0), ('name1-4', 0.0), 1L)
663
[5] (((1.0, 5), 'name2-5', 0.0), ('name1-5', 0.0), 2L)
664
[6] (((1.0, 6), 'name2-6', 0.0), ('name1-6', 0.0), 0L)
665
[7] (((1.0, 7), 'name2-7', 0.0), ('name1-7', 0.0), 1L)
666
[8] (((1.0, 8), 'name2-8', 0.0), ('name1-8', 0.0), 2L)
667
[9] (((1.0, 9), 'name2-9', 0.0), ('name1-9', 0.0), 0L)
668
[10] (((1.0, 0), 'name2-0', 0.0), ('name1-0', 0.0), 0L)
669
[11] (((1.0, 4), 'name2-4', 0.0), ('name1-4', 0.0), 1L)
670
[12] (((1.0, 8), 'name2-8', 0.0), ('name1-8', 0.0), 2L)
671
/table2 (Table(3L,)) ''
673
[0] (((1.0, 0), 'name2-0', 0.0), ('name1-0', 0.0), 0L)
674
[1] (((1.0, 4), 'name2-4', 0.0), ('name1-4', 0.0), 1L)
675
[2] (((1.0, 8), 'name2-8', 0.0), ('name1-8', 0.0), 2L)
678
>Most of the code in this section is also available in
680
CLASS="computeroutput"
681
>examples/nested-tut.py</SAMP
686
CLASS="computeroutput"
689
comprehensive toolset to cope with nested structures and
address your classification needs. However, caveat
emptor: be sure not to nest your data too deeply, or you
will inevitably get lost interpreting heavily intertwined
lists, tuples and description objects.
702
SUMMARY="Footer navigation table"
741
>Using enumerated types</TD
755
>Other examples in PyTables distribution</TD