.. _special encoder: http://svn.red-bean.com/bob/simplejson/tags/simplejson-1.7/docs/index.html

.. _topics-serialization-natural-keys:

Natural keys
------------

.. versionadded:: 1.2

   The ability to use natural keys when serializing/deserializing data was
   added in the 1.2 release.

The default serialization strategy for foreign keys and many-to-many
relations is to serialize the value of the primary key(s) of the
objects in the relation. This strategy works well for most types of
object, but it can cause difficulty in some circumstances.

Consider the case of a list of objects that have a foreign key referencing
:class:`ContentType`. If you're going to serialize an object that
refers to a content type, you need to have a way to refer to that
content type. Content types are automatically created by Django as
part of the database synchronization process, so you don't need to
include content types in a fixture or other serialized data. As a
result, the primary key of any given content type isn't easy to
predict; it depends on how and when :djadmin:`syncdb` was
executed to create the content types.

There is also the matter of convenience. An integer id isn't always
the most convenient way to refer to an object; sometimes, a
more natural reference would be helpful.

It is for these reasons that Django provides *natural keys*. A natural
key is a tuple of values that can be used to uniquely identify an
object instance without using the primary key value.

Deserialization of natural keys
228
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
230
Consider the following two models::

    from django.db import models

    class Person(models.Model):
        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

        birthdate = models.DateField()

        class Meta:
            unique_together = (('first_name', 'last_name'),)

    class Book(models.Model):
        name = models.CharField(max_length=100)
        author = models.ForeignKey(Person)

Ordinarily, serialized data for ``Book`` would use an integer to refer to
the author. For example, in JSON, a ``Book`` might be serialized as::

    {
        "pk": 1,
        "model": "store.book",
        "fields": {
            "name": "Mostly Harmless",
            "author": 42
        }
    }

This isn't a particularly natural way to refer to an author. It
requires that you know the primary key value for the author; it also
requires that this primary key value is stable and predictable.

However, if we add natural key handling to ``Person``, the fixture becomes
much more readable. To add natural key handling, you define a default
manager for ``Person`` with a ``get_by_natural_key()`` method. In the case
of a ``Person``, a good natural key might be the pair of first and last
name::

    from django.db import models

    class PersonManager(models.Manager):
        def get_by_natural_key(self, first_name, last_name):
            return self.get(first_name=first_name, last_name=last_name)

    class Person(models.Model):
        objects = PersonManager()

        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

        birthdate = models.DateField()

        class Meta:
            unique_together = (('first_name', 'last_name'),)

Now books can use that natural key to refer to ``Person`` objects::

    {
        "pk": 1,
        "model": "store.book",
        "fields": {
            "name": "Mostly Harmless",
            "author": ["Douglas", "Adams"]
        }
    }

When you try to load this serialized data, Django will use the
``get_by_natural_key()`` method to resolve ``["Douglas", "Adams"]``
into the primary key of an actual ``Person`` object.

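The lookup on load can be sketched in plain Python, without Django. This is a
simplified illustration and not Django's deserializer: ``FakePersonManager``,
``resolve_foreign_key()`` and the in-memory rows are hypothetical names made up
for the example. It only shows the idea that a list-valued reference is handed
to ``get_by_natural_key()``, while any other value is treated as a primary key.

```python
# Hypothetical stand-ins for a model manager and a deserializer helper.
# None of these names are part of Django.

class FakePersonManager:
    """Holds rows in memory, keyed by primary key."""

    def __init__(self, rows):
        # rows maps pk -> (first_name, last_name)
        self._rows = rows

    def get_by_natural_key(self, first_name, last_name):
        # Find the single row whose natural key matches. For brevity this
        # returns the primary key directly; a real manager method would
        # return the object itself.
        matches = [pk for pk, key in self._rows.items()
                   if key == (first_name, last_name)]
        if len(matches) != 1:
            raise LookupError("natural key must match exactly one object")
        return matches[0]


def resolve_foreign_key(manager, value):
    """Turn a serialized reference into a primary key."""
    if isinstance(value, (list, tuple)):
        # A sequence is treated as a natural key.
        return manager.get_by_natural_key(*value)
    # Anything else is assumed to already be a primary key.
    return value


people = FakePersonManager({42: ("Douglas", "Adams"), 43: ("Terry", "Pratchett")})
author_pk = resolve_foreign_key(people, ["Douglas", "Adams"])
same_pk = resolve_foreign_key(people, 42)
```

Both calls end up at the same row, which is why a fixture can mix integer
references and natural key references.
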
.. note::

    Whatever fields you use for a natural key must be able to uniquely
    identify an object. This will usually mean that your model will
    have a uniqueness clause (either ``unique=True`` on a single field, or
    ``unique_together`` over multiple fields) for the field or fields
    in your natural key. However, uniqueness doesn't need to be
    enforced at the database level. If you are certain that a set of
    fields will be effectively unique, you can still use those fields
    as a natural key.

Serialization of natural keys
317
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
319
So how do you get Django to emit a natural key when serializing an object?
First, you need to add another method, this time to the model itself::

    class Person(models.Model):
        objects = PersonManager()

        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

        birthdate = models.DateField()

        def natural_key(self):
            return (self.first_name, self.last_name)

        class Meta:
            unique_together = (('first_name', 'last_name'),)

That method should always return a natural key tuple; in this
example, ``(first name, last name)``. Then, when you call
``serializers.serialize()``, you provide a ``use_natural_keys=True``
argument::

    >>> serializers.serialize('json', [book1, book2], indent=2, use_natural_keys=True)

When ``use_natural_keys=True`` is specified, Django will use the
``natural_key()`` method to serialize any reference to objects of the
type that defines the method.

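The choice between a primary key and a natural key can be sketched in plain
Python. This is an illustration under stated assumptions, not Django's
serializer: the ``Person`` and ``Publisher`` classes here are ordinary Python
classes, and ``serialize_reference()`` is a hypothetical helper invented for
the example.

```python
# Hypothetical sketch: not Django's serializer code.

class Person:
    def __init__(self, pk, first_name, last_name):
        self.pk = pk
        self.first_name = first_name
        self.last_name = last_name

    def natural_key(self):
        return (self.first_name, self.last_name)


class Publisher:
    """A related object that does not define natural_key()."""

    def __init__(self, pk):
        self.pk = pk


def serialize_reference(related, use_natural_keys=False):
    """Return the value written for a foreign key pointing at ``related``."""
    if use_natural_keys and hasattr(related, "natural_key"):
        # JSON has no tuple type, so the key is emitted as a list.
        return list(related.natural_key())
    # Fall back to the primary key.
    return related.pk


author = Person(42, "Douglas", "Adams")
publisher = Publisher(7)

with_keys = serialize_reference(author, use_natural_keys=True)
without_keys = serialize_reference(author)
fallback = serialize_reference(publisher, use_natural_keys=True)
```

Objects whose type has no ``natural_key()`` method keep their primary key
reference even when natural keys are requested.
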
If you are using :djadmin:`dumpdata` to generate serialized data, you
use the :djadminopt:`--natural` command line flag to generate natural keys.

.. note::

    You don't need to define both ``natural_key()`` and
    ``get_by_natural_key()``. If you don't want Django to output
    natural keys during serialization, but you want to retain the
    ability to load natural keys, then you can opt to not implement
    the ``natural_key()`` method.

    Conversely, if (for some strange reason) you want Django to output
    natural keys during serialization, but *not* be able to load those
    key values, just don't define the ``get_by_natural_key()`` method.

Dependencies during serialization
363
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
365
Since natural keys rely on database lookups to resolve references, it
is important that data exists before it is referenced. You can't make
a *forward reference* with natural keys; the data you are referencing
must exist before you include a natural key reference to that data.

To accommodate this limitation, calls to :djadmin:`dumpdata` that use
the :djadminopt:`--natural` option will serialize any model with a
``natural_key()`` method before serializing objects that use standard
primary keys.

However, this may not always be enough. If your natural key refers to
another object (by using a foreign key or natural key to another object
as part of a natural key), then you need to be able to ensure that
the objects on which a natural key depends occur in the serialized data
before the natural key requires them.

To control this ordering, you can define dependencies on your
``natural_key()`` methods. You do this by setting a ``dependencies``
attribute on the ``natural_key()`` method itself.

For example, consider the ``Permission`` model in ``contrib.auth``.
The following is a simplified version of the ``Permission`` model::

    class Permission(models.Model):
        name = models.CharField(max_length=50)
        content_type = models.ForeignKey(ContentType)
        codename = models.CharField(max_length=100)

        def natural_key(self):
            return (self.codename,) + self.content_type.natural_key()

The natural key for a ``Permission`` is a combination of the codename for
the ``Permission`` and the ``ContentType`` to which the ``Permission``
applies. This means that ``ContentType`` must be serialized before
``Permission``. To define this dependency, we add one extra line::

    class Permission(models.Model):
        # Fields as in the previous example.
        def natural_key(self):
            return (self.codename,) + self.content_type.natural_key()
        natural_key.dependencies = ['contenttypes.contenttype']

This definition ensures that ``ContentType`` models are serialized before
``Permission`` models. In turn, any object referencing ``Permission`` will
be serialized after both ``ContentType`` and ``Permission``.
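
The ``dependencies`` attribute works because a Python function is an object
that can carry extra attributes. The following plain Python sketch shows how
such an attribute could drive the serialization order. It is only an
illustration: the classes are ordinary Python classes standing in for models,
the ``label`` attribute and ``sort_by_dependencies()`` are hypothetical names,
and the depth-first ordering is one possible approach rather than the
algorithm :djadmin:`dumpdata` actually uses.

```python
# Hypothetical sketch: plain classes stand in for Django models, and
# sort_by_dependencies() is an illustrative helper, not a Django function.

class ContentType:
    label = "contenttypes.contenttype"

    def natural_key(self):
        return ("auth", "permission")


class Permission:
    label = "auth.permission"

    def natural_key(self):
        return ("add_user",) + ContentType().natural_key()
    # A function is an object, so it can carry extra attributes.
    natural_key.dependencies = ["contenttypes.contenttype"]


def sort_by_dependencies(models):
    """Order models so that each one follows the models it depends on."""
    by_label = {model.label: model for model in models}
    ordered, visiting = [], set()

    def visit(model):
        if model in ordered:
            return
        if model in visiting:
            raise ValueError("circular dependency at %s" % model.label)
        visiting.add(model)
        # Models without a dependencies attribute depend on nothing.
        for label in getattr(model.natural_key, "dependencies", []):
            if label in by_label:  # skip models that are not being dumped
                visit(by_label[label])
        visiting.discard(model)
        ordered.append(model)

    for model in models:
        visit(model)
    return ordered


order = [m.label for m in sort_by_dependencies([Permission, ContentType])]
```

Even though ``Permission`` is listed first in the input, ``ContentType`` comes
first in ``order`` because of the declared dependency.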