4408.2.22
by Xavier Morel
[ADD] some import doc |
1 |
.. _bulk-import: |
2 |
||
3 |
Bulk Import
|
|
4 |
===========
|
|
5 |
||
6 |
OpenERP has included a bulk import facility for CSV-ish files for a |
|
7 |
long time. With 7.0, both the interface and internal implementation |
|
8 |
have been redone, resulting in |
|
9 |
:meth:`~openerp.osv.orm.BaseModel.load`. |
|
10 |
||
11 |
.. note:: |
|
12 |
||
13 |
the previous bulk-loading method, |
|
14 |
:meth:`~openerp.osv.orm.BaseModel.import_data`, remains for |
|
15 |
backwards compatibility but was re-implemented on top of |
|
16 |
:meth:`~openerp.osv.orm.BaseModel.load`, while its interface is |
|
17 |
unchanged its precise behavior has likely been altered for some |
|
18 |
cases (it shouldn't throw exceptions anymore in many cases where |
|
19 |
it previously did) |
|
20 |
||
21 |
This document attempts to explain the behavior and limitations of |
|
22 |
:meth:`~openerp.osv.orm.BaseModel.load`. |
|
23 |
||
24 |
Data
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
25 |
----
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
26 |
|
27 |
The input ``data`` is a regular row-major matrix of strings (in Python
|
|
28 |
datatype terms, a ``list`` of rows, each row being a ``list`` of |
|
29 |
``str``, all rows must be of equal length). Each row must be the same
|
|
30 |
length as the ``fields`` list preceding it in the argslist.
|
|
31 |
||
32 |
Each field of ``fields`` maps to a (potentially relational and nested)
|
|
33 |
field of the model under import, and the corresponding column of the |
|
34 |
``data`` matrix provides a value for the field for each record.
|
|
35 |
||
36 |
Generally speaking each row of the input yields a record of output, |
|
37 |
and each cell of a row yields a value for the corresponding field of |
|
38 |
the row's record. There is currently one exception for this rule: |
|
39 |
||
40 |
One to Many fields
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
41 |
++++++++++++++++++
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
42 |
|
43 |
Because O2M fields contain multiple records "embedded" in the main |
|
44 |
one, and these sub-records are fully dependent on the main record (are |
|
45 |
no other references to the sub-records in the system), they have to be |
|
46 |
spliced into the matrix somehow. This is done by adding lines composed |
|
47 |
*only* of o2m record fields below the main record:
|
|
48 |
||
4133.1.19
by Antony Lesuisse
rearrange toc, remove some deprecated stuff |
49 |
.. literalinclude:: 06_misc_import_o2m.txt |
4408.2.22
by Xavier Morel
[ADD] some import doc |
50 |
|
51 |
the sections in double-lines represent the span of two o2m |
|
52 |
fields. During parsing, they are extracted into their own ``data``
|
|
53 |
matrix for the o2m field they correspond to. |
|
54 |
||
55 |
Import process
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
56 |
--------------
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
57 |
|
58 |
Here are the phases of import. Note that the concept of "phases" is |
|
59 |
fuzzy as it's currently more of a pipeline, each record moves through |
|
60 |
the entire pipeline before the next one is processed. |
|
61 |
||
62 |
Extraction
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
63 |
++++++++++
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
64 |
|
65 |
The first phase of the import is the extraction of the current row |
|
66 |
(and potentially a section of rows following it if it has One to Many |
|
67 |
fields) into a record dictionary. The keys are the ``fields``
|
|
68 |
originally passed to :meth:`~openerp.osv.orm.BaseModel.load`, and the |
|
69 |
values are either the string value at the corresponding cell (for |
|
70 |
non-relational fields) or a list of sub-records (for all relational |
|
71 |
fields). |
|
72 |
||
73 |
This phase also generates the ``rows`` indexes for any
|
|
74 |
:ref:`import-message` produced thereafter. |
|
75 |
||
76 |
Conversion
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
77 |
++++++++++
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
78 |
|
4678
by Xavier Morel
[FIX] doc warnings: replace ref by term, add glossary |
79 |
This second phase takes the record dicts, extracts the :term:`database |
80 |
ID` and :term:`external ID` if present and attempts to convert each
|
|
81 |
field to a type matching what OpenERP expects to write. |
|
4408.2.43
by Xavier Morel
[FIX] import docs: clarify handling of empty cells |
82 |
|
83 |
* Empty fields (empty strings) are replaced with the ``False`` value |
|
84 |
||
85 |
* Non-empty fields are converted through
|
|
86 |
:class:`~openerp.addons.base.ir.ir_fields.ir_fields_converter` |
|
87 |
||
88 |
.. note:: if a field is specified in the import, its default will *never* be |
|
89 |
used. If some records need to have a value and others need to use |
|
90 |
the model's default, either specify that default explicitly or do |
|
91 |
the import in two phases. |
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
92 |
|
93 |
Char, text and binary fields
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
94 |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
95 |
|
96 |
Are returned as-is, without any alteration. |
|
97 |
||
98 |
Boolean fields
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
99 |
~~~~~~~~~~~~~~
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
100 |
|
101 |
The string value is compared (in a case-insensitive manner) to ``0``,
|
|
102 |
``false`` and ``no`` as well of any translation thereof loaded in the |
|
103 |
database. If the value matches one of these, the field is set to |
|
104 |
``False``.
|
|
105 |
||
106 |
Otherwise the field is compared to ``1``, ``true`` and ``yes`` (and |
|
107 |
any translation of these in the database). The field is always set to |
|
108 |
``True``, but if the value does not match one of these a warning will
|
|
109 |
also be output. |
|
110 |
||
111 |
Integers and float fields
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
112 |
~~~~~~~~~~~~~~~~~~~~~~~~~
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
113 |
|
114 |
The field is parsed with Python's built-in conversion routines |
|
115 |
(``int`` and ``float`` respectively), if the conversion fails an error |
|
116 |
is generated. |
|
117 |
||
118 |
Selection fields
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
119 |
~~~~~~~~~~~~~~~~
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
120 |
|
121 |
The field is compared to 1. the values of the selection (first part of |
|
122 |
each selection tuple) and 2. all translations of the selection label |
|
123 |
found in the database. |
|
124 |
||
125 |
If one of these is matched, the corresponding value is set on the |
|
126 |
field. |
|
127 |
||
128 |
Otherwise an error is generated. |
|
129 |
||
130 |
The same process applies to both list-type and function-type selection |
|
131 |
fields. |
|
132 |
||
133 |
Many to One field
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
134 |
~~~~~~~~~~~~~~~~~
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
135 |
|
136 |
If the specified field is the relational field itself (``m2o``), the
|
|
137 |
value is used in a ``name_search``. The first record returned by
|
|
138 |
``name_search`` is used as the field's value.
|
|
139 |
||
140 |
If ``name_search`` finds no value, an error is generated. If
|
|
141 |
``name_search`` finds multiple value, a warning is generated to warn
|
|
142 |
the user of ``name_search`` collisions.
|
|
143 |
||
4678
by Xavier Morel
[FIX] doc warnings: replace ref by term, add glossary |
144 |
If the specified field is a :term:`external ID` (``m2o/id``), the |
4408.2.22
by Xavier Morel
[ADD] some import doc |
145 |
corresponding record it looked up in the database and used as the |
146 |
field's value. If no record is found matching the provided external |
|
147 |
ID, an error is generated. |
|
148 |
||
4678
by Xavier Morel
[FIX] doc warnings: replace ref by term, add glossary |
149 |
If the specified field is a :term:`database ID` (``m2o/.id``), the |
150 |
process is the same as for external ids (on database identifiers |
|
151 |
instead of external ones). |
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
152 |
|
153 |
Many to Many field
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
154 |
~~~~~~~~~~~~~~~~~~
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
155 |
|
156 |
The field's value is interpreted as a comma-separated list of names, |
|
157 |
external ids or database ids. For each one, the process previously |
|
158 |
used for the many to one field is applied. |
|
159 |
||
160 |
One to Many field
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
161 |
~~~~~~~~~~~~~~~~~
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
162 |
|
163 |
For each o2m record extracted, if the record has a ``name``,
|
|
4678
by Xavier Morel
[FIX] doc warnings: replace ref by term, add glossary |
164 |
:term:`external ID` or :term:`database ID` the :term:`database ID` is |
165 |
looked up and checked through the same process as for m2o fields. |
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
166 |
|
4678
by Xavier Morel
[FIX] doc warnings: replace ref by term, add glossary |
167 |
If a :term:`database ID` was found, a LINK_TO command is emmitted, |
168 |
followed by an UPDATE with the non-db values for the relational field. |
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
169 |
|
170 |
Otherwise a CREATE command is emmitted. |
|
171 |
||
4408.2.45
by Xavier Morel
[IMP] document date and datetime conversions |
172 |
Date fields
|
173 |
~~~~~~~~~~~
|
|
174 |
||
175 |
The value's format is checked against |
|
176 |
:data:`~openerp.tools.misc.DEFAULT_SERVER_DATE_FORMAT`, an error is |
|
177 |
generated if it does not match the specified format. |
|
178 |
||
179 |
Datetime fields
|
|
180 |
~~~~~~~~~~~~~~~
|
|
181 |
||
182 |
The value's format is checked against |
|
183 |
:data:`~openerp.tools.misc.DEFAULT_SERVER_DATETIME_FORMAT`, an error |
|
184 |
is generated if it does not match. |
|
185 |
||
186 |
The value is then interpreted as a datetime in the user's |
|
187 |
timezone. The timezone is specified thus: |
|
188 |
||
189 |
* If the import ``context`` contains a ``tz`` key with a valid |
|
190 |
timezone name, this is the timezone of the datetime. |
|
191 |
||
192 |
* Otherwise if the user performing the import has a ``tz`` attribute |
|
193 |
set to a valid timezone name, this is the timezone of the datetime. |
|
194 |
||
195 |
* Otherwise interpret the datetime as being in the ``UTC`` timezone. |
|
196 |
||
4408.2.22
by Xavier Morel
[ADD] some import doc |
197 |
Create/Write
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
198 |
++++++++++++
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
199 |
|
200 |
If the conversion was successful, the converted record is then saved |
|
201 |
to the database via ``(ir.model.data)._update``.
|
|
202 |
||
203 |
Error handling
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
204 |
++++++++++++++
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
205 |
|
206 |
The import process will only catch 2 types of exceptions to convert |
|
207 |
them to error messages: ``ValueError`` during the conversion process,
|
|
208 |
and sub-exceptions of ``psycopg2.Error`` during the create/write
|
|
209 |
process. |
|
210 |
||
211 |
The import process uses savepoint to: |
|
212 |
||
213 |
* protect the overall transaction from the failure of each ``_update`` |
|
214 |
call, if an ``_update`` call fails the savepoint is rolled back and
|
|
215 |
the import process keeps going in order to obtain as many error |
|
216 |
messages as possible during each run. |
|
217 |
||
218 |
* protect the import as a whole, a savepoint is created before
|
|
219 |
starting and if any error is generated that savepoint is rolled |
|
220 |
back. The rest of the transaction (anything not within the import |
|
221 |
process) will be left untouched. |
|
222 |
||
223 |
.. _import-message: |
|
224 |
.. _import-messages: |
|
225 |
||
226 |
Messages
|
|
4408.2.44
by Xavier Morel
[FIX] titles of import doc |
227 |
--------
|
4408.2.22
by Xavier Morel
[ADD] some import doc |
228 |
|
229 |
A message is a dictionary with 5 mandatory keys and one optional key: |
|
230 |
||
231 |
``type``
|
|
232 |
the type of message, either ``warning`` or ``error``. Any |
|
233 |
``error`` message indicates the import failed and was rolled back.
|
|
234 |
||
235 |
``message``
|
|
236 |
the message's actual text, which should be translated and can be |
|
237 |
shown to the user directly |
|
238 |
||
239 |
``rows``
|
|
240 |
a dict with 2 keys ``from`` and ``to``, indicates the range of |
|
241 |
rows in ``data`` which generated the message
|
|
242 |
||
243 |
``record``
|
|
244 |
a single integer, for warnings the index of the record which |
|
245 |
generated the message (can be obtained from a non-false ``ids``
|
|
246 |
result) |
|
247 |
||
248 |
``field``
|
|
249 |
the name of the (logical) OpenERP field for which the error or |
|
250 |
warning was generated |
|
251 |
||
252 |
``moreinfo`` (optional)
|
|
253 |
A string, a list or a dict, leading to more information about the |
|
254 |
warning. |
|
255 |
||
256 |
* If ``moreinfo`` is a string, it is a supplementary warnings |
|
257 |
message which should be hidden by default |
|
258 |
* If ``moreinfo`` is a list, it provides a number of possible or |
|
259 |
alternative values for the string |
|
260 |
* If ``moreinfo`` is a dict, it is an OpenERP action descriptor |
|
261 |
which can be executed to get more information about the issues |
|
262 |
with the field. If present, the ``help`` key serves as a label
|
|
263 |
for the action (e.g. the text of the link). |