3
Tangram - Mapping inheritance
7
There are many ways of representing inheritance relationships in a relational
8
database. This document describes three popular ways and how Tangram supports
11
=head1 STRATEGIES FOR MAPPING INHERITANCE
13
Inheritance is a concept that has no equivalent in the relational
14
world. However, it is possible to implement it by using a combination of
15
relational features like tables and foreign keys.
17
One of the paramount issues about mapping inheritance is how well the mapping
18
supports polymorphism. Any Object-Oriented persistence facility that deserves
19
its name needs to allow the retrieval of all the Fruits, and return a
20
heterogeneous collection of Apples, Oranges and Bananas. Also, it must
21
perform this operation in an efficient manner. In particular, polymorphic
22
retrieval should not cost one SELECT per retrieved object.
24
A secondary - yet important - issue is how well the mapping plays by the rules
25
of orthogonal orthodoxy.
27
Another issue we'll examine is how well the mapping supports 'complex' queries,
28
that is, queries that involve several objects
30
Three strategies are in common use, that go by the name Vertical, Horizontal
31
and Filtered mapping. They all have advantages and disadvantages.
33
The following sections describe the three strategies in details. They make use of
34
a simple object model to illustrate the mappings.
37
+---------------------+
40
+---------<------- 1 +---------------------+
42
| +---------------------+
46
| +------------------+---------------------+
48
| +---------------+ +-----------------+
49
V | NaturalPerson | | LegalPerson |
50
| +---------------+ +-----------------+
51
| | age: integer | | form: string |
52
| +---------------+ +-----------------+
56
| +---------------------+
57
+-------->-------- * | Vehicle |
59
+---------------------+
61
+---------------------+
65
+------------------+-------------------+
67
+---------------+ +-----------------+
69
+---------------+ +-----------------+
70
| plate: string | | ident: string |
71
+---------------+ +-----------------+
74
=head1 Horizontal Mapping
78
Each I<concrete> class is mapped onto a single table. Each row in the
79
table describes the persistent state of one object.
81
The attributes are mapped onto columns, usually one column per
82
attribute but not necessarily. For example, collections may be stored elsewhere
83
(for example on a link table) and thus require no column on the class' table.
85
In effect, the database looks like this:
89
+------+--------+-------+------+
91
================================
92
| 17 | Bill Gates | 46 |
93
+------+----------------+------+
94
| 23 | Georges Bush | 50 |
95
+------+----------------+------+
100
+------+------+---------+------+
102
================================
103
| 36 | Microsoft | Inc |
104
+------+----------------+------+
109
+------+-------+----------------+--------+
110
| id | owner | make | plate |
111
==========================================
112
| 12 | 17 | Saab | BILL-1 |
113
+------+-------+----------------+--------+
114
| 50 | 36 | Miata | MS-001 |
115
+------+-------+----------------+--------+
116
| 51 | 36 | Miata | MS-002 |
117
+------+-------+----------------+--------+
122
+------++-----+----------------+--------+
123
| id | owner| make | ident |
124
=========================================
125
| 29 | 23 | Boeing | AF-001 |
126
+------+------+----------------+--------+
130
Polymorphic retrieval costs one SELECT per concrete conforming class; retrieving
131
all the Persons costs two SELECTs. These SELECTs, however, don't use joins -
132
an expensive operation. In our example, retrieving all the Persons requires the
133
following two SELECTs:
135
SELECT id, name, age FROM NaturalPerson
136
SELECT id, name, form FROM LegalPerson
140
This mapping is reasonable with regard to relational orthodoxy, but not perfect:
141
the 'name' column is present on two different tables, with the same semantic.
143
The biggest drawback, however, happens when you try to perfrom complex queries.
144
Suppose oyu want to retrieve all the Persons (Natural- or Legal-) that own a
145
Vehicle of make 'Saab' (be it a Car or a Plane). Sticking with equijoins, the
146
cost of the operation is four SELECTs:
148
SELECT NaturalPerson.id, NaturalPerson.name, NaturalPerson.age
149
FROM NaturalPerson, Car
150
WHERE Car.owner = NaturalPerson.id
152
SELECT NaturalPerson.id, NaturalPerson.name, NaturalPerson.age
153
FROM NaturalPerson, Plane
154
WHERE Plane.owner = NaturalPerson.id
156
SELECT LegalPerson.id, LegalPerson.name, LegalPerson.form
157
FROM LegalPerson, Car
158
WHERE Car.owner = LegalPerson.id
160
SELECT LegalPerson.id, LegalPerson.name, LegalPerson.form
161
FROM LegalPerson, Plane
162
WHERE Plane.owner = LegalPerson.id
164
When the depth of the hierarchies increase, the combinatory explosion makes
165
complex queries prohibitive.
167
=head1 Vertical Mapping
171
Each class has its corresponding table, which contains only the class' direct
172
fields. In other words, the table doesn't store the inherited fields. Both
173
concrete and abstract classes get a table. The state of an object is
174
thus scattered over several tables.
180
+------+-+------+-------+
182
=========================
184
+------+----------------+
185
| 23 | Georges Bush |
186
+------+----------------+
188
+------+----------------+
191
+---------------+ +-------------+
192
| NaturalPerson | | LegalPerson |
193
+------+--------+ +-------+-----++
194
| id | age | | id | form |
195
================= ================
196
| 17 | 46 | | 36 | Inc |
197
+------+--------+ +-------+------+
203
+------+--+----+----------------+
204
| id | owner | make |
205
=================================
207
+------+-------+----------------+
209
+------+-------+----------------+
211
+------+-------+----------------+
213
+------+-------+----------------+
217
+------++--------+ +-------+--------+
218
| id | plate | | id | ident |
219
================== ==================
220
| 12 | BILL-1 | | 29 | AF-001 |
221
+-------+--------+ +-------+--------+
227
Polymorphic retrieval is achieved by issuing one SELECT per concrete
228
conforming class; retrieving In our example, retrieving all the
229
Persons requires the following two SELECTs:
231
SELECT Person.id, Person.name, NaturalPerson.age
232
FROM Person, NaturalPerson
233
WHERE Person.id = NaturalPerson.id
235
SELECT Person.id, Person.name, LegalPerson.form
236
FROM Person, LegalPerson
237
WHERE Person.id = LegalPerson.id
239
This mapping sometimes needs an extra column that carries a type
240
identifier. In our example, we take the very resonable assumption that
241
Person is an abstract class. Had we decided to allow 'pure' Persons,
242
we would have been faced with the following problem: the Person table
243
would contain rows that describe pure Persons, but also rows that
244
describe the Person part of Natural- and LegalPersons. We would need
245
to filter those incomplete objects out when retrieving the pure
246
Persons. Thus the Person table would look like this:
250
+-----+--+---+----------------+
252
===============================
253
| 13 | 1 | Pure Person |
254
+-----+------+----------------+
255
| 17 | 2 | Bill Gates |
256
+-----+------+----------------+
257
| 23 | 2 | Georges Bush |
258
+-----+------+----------------+
259
| 36 | 3 | Microsoft |
260
+-----+------+----------------+
262
In this case, we need an extra SELECT for retrieving pure Persons:
264
SELECT Person.id, Person.name
266
WHERE Person.type IN (1)
270
From the relational point of view, this mapping is excellent: the
271
resulting database is in third normal form.
273
This mapping also supports complex queries very well. Take the Saab
274
owners example again: we don't need to involve the Car nor Plane
275
tables in the query. As a result, two SELECTs suffice:
277
SELECT Person.id, Person.name, NaturalPerson.age
278
FROM Person, NaturalPerson, Vehicle
279
WHERE Person.id = NaturalPerson.id AND Vehicle.owner = Person.id
281
SELECT Person.id, Person.name, LegalPerson.form
282
FROM Person, LegalPerson, Vehicle
283
WHERE Person.id = LegalPerson.id AND Vehicle.owner = Person.id
288
The mapping potentially has the highest performance cost: it requires
289
multiple SELECTs like the horizontal mapping, but in addition, these
293
=head1 Filtered Mapping
297
Entire hierarchies are mapped onto a single table. Two rows may
298
describe objects of different types, maybe completely unrelated. The
299
set of columns is the uperset of all the columns needed by all the
300
attributes of any of the classes involved in the mapping.
302
A special 'type' column contains an value that uniquely identifies the
303
concrete class of the object described by the row.
305
All the columns related to attributes that don't occur in all the
306
classes must be declared as NULLABLE. Indeed, the table may contain
309
In our example, the database may look either like this:
313
+-----+---+--+----------------+------+------+
314
| id | type | name | age | form |
315
=============================================
316
| 17 | 1 | Bill Gates | 46 | NULL |
317
+-----+------+----------------+------+------+
318
| 23 | 1 | Georges Bush | 50 | NULL |
319
+-----+------+----------------+------+------+
320
| 36 | 2 | Microsoft | NULL | Inc |
321
+-----+------+----------------+------+------+
325
+-----+---+--+----------------+------+------+
326
| id | type | name | age | form |
327
=============================================
328
| 17 | 1 | Bill Gates | 46 | NULL |
329
+-----+------+----------------+------+------+
330
| 23 | 1 | Georges Bush | 50 | NULL |
331
+-----+------+----------------+------+------+
332
| 36 | 2 | Microsoft | NULL | Inc |
333
+-----+------+----------------+------+------+
334
| 36 | 2 | Microsoft | NULL | Inc |
335
+-----+------+----------------+------+------+
339
+-----+----+-+-------+----------------+--------+--------+
340
| id | type | owner | make | plate | ident |
341
=========================================================
342
| 12 | 3 | 17 | Saab | BILL-1 | NULL |
343
+-----+------+-------+----------------+--------+--------+
344
| 29 | 4 | 23 | Boeing | NULL | AF-001 |
345
+-----+------+-------+----------------+--------+--------+
346
| 50 | 3 | 36 | Miata | MS-001 | NULL |
347
+-----+------+-------+----------------+--------+--------+
348
| 51 | 3 | 36 | Miata | MS-002 | NULL |
349
+-----+------+-------+----------------+--------+--------+
351
Retrieving all the Persons requires only one SELECT:
353
SELECT id, name, age, form FROM Persons
355
When retrieving NaturalPersons we must take care to filter out the
356
rows that belog to LegalPersons:
358
SELECT id, name, age FROM Persons WHERE type = 1
360
We may even decide to place unrelated hierarchies on the same table:
364
+-----+---+--+---------------+------+------+--------+--------+--------+
365
| id | type | name | age | form | make | plate | ident |
366
=======================================================================
367
| 17 | 1 | Bill Gates | 46 | NULL | NULL | NULL | NULL |
368
+-----+------+---------------+------+------+--------+--------+--------+
369
| 23 | 1 | Georges Bush | 50 | NULL | NULL | NULL | NULL |
370
+-----+------+---------------+------+------+--------+--------+--------+
371
| 36 | 2 | Microsoft | NULL | Inc | NULL | NULL | NULL |
372
+-----+------+---------------+------+------+--------+--------+--------+
373
| 12 | 3 | NULL | NULL | NULL | Saab | BILL-1 | NULL |
374
+-----+------+---------------+------+------+--------+--------+--------+
375
| 29 | 4 | NULL | NULL | NULL | Boeing | NULL | AF-001 |
376
+-----+------+---------------+------+------+--------+--------+--------+
377
| 50 | 3 | NULL | NULL | NULL | Miata | MS-001 | NULL |
378
+-----+------+---------------+------+------+--------+--------+--------+
379
| 51 | 3 | NULL | NULL | NULL | Miata | MS-002 | NULL |
380
+-----+------+---------------+------+------+--------+--------+--------+
384
Polymorphic retrieval costs exactly one SELECT, regardless of the
385
number of conforming types. Thus this mapping potentially is the most
390
This mapping is very questionable according to relational
391
orthodoxy. Even if one decides to forgo these rules, using such a
392
mapping takes away many of the interesting features offered by modern
393
RDBM systems. Because nearly all the columns must allow NULL values,
394
we cannot take advantage of features like referential integrity
395
constraints, domain constraints, indexes, etc.
397
Also, as the table becomes cluttered with NULL values, the relative
398
number of significant columns in any given row tends towards zero: we
399
may end up retrieving rows consisting of a little information swimming
402
In effect, this mapping may end up hindering performance instead of
403
improving it in presence of deep hierarchies with many attributes.
405
=head1 MAPPINGS SUPPORTED BY TANGRAM
407
Tangram supports both vertical mapping and filtered mapping, and any
410
The 'table' attribute in the class description in the Schema can be
411
used to put the state of several classes on the same table. The table
412
name defaults to the class name, resulting in a vertical mapping.
414
For example, the following schema:
416
Tangram::Relational->schema( {
421
fields => { string => [ qw( name ) ] }
427
fields => { int => [ qw( age ) ] }
433
fields => { string => [ qw( form ) ] }
437
...specifies a pure filtered mapping for the Person hierarchy:
443
type INTEGER NOT NULL,
444
form VARCHAR(255) NULL,
446
name VARCHAR(255) NULL
449
The following schema:
451
Tangram::Relational->schema( {
456
fields => { string => [ qw( name ) ] }
461
table => 'NaturalPerson',
462
fields => { int => [ qw( age ) ] }
468
fields => { string => [ qw( form ) ] }
472
...gives NaturalPerson its own table, but LegalPerson shares the
479
type INTEGER NOT NULL,
480
form VARCHAR(255) NULL,
481
name VARCHAR(255) NULL
484
CREATE TABLE NaturalPerson
488
type INTEGER NOT NULL,