SQL Expressions
=================

.. _faq_sql_expression_string:

How do I render SQL expressions as strings, possibly with bound parameters inlined?
------------------------------------------------------------------------------------

The "stringification" of a SQLAlchemy statement or Query in the vast majority
of cases is as simple as::

    print str(statement)
this applies both to an ORM :class:`~.orm.query.Query` as well as any :func:`.select` or other
statement.   Additionally, to get the statement as compiled to a
specific dialect or engine, if the statement itself is not already
bound to one you can pass this in to :meth:`.ClauseElement.compile`::

    print statement.compile(someengine)

or without an :class:`.Engine`::

    from sqlalchemy.dialects import postgresql
    print statement.compile(dialect=postgresql.dialect())

When given an ORM :class:`~.orm.query.Query` object, in order to get at the
:meth:`.ClauseElement.compile`
method we only need to access the :attr:`~.orm.query.Query.statement`
accessor first::

    statement = query.statement
    print statement.compile(someengine)
The above forms will render the SQL statement as it is passed to the Python
:term:`DBAPI`, which includes that bound parameters are not rendered inline.
SQLAlchemy normally does not stringify bound parameters, as this is handled
appropriately by the Python DBAPI, not to mention bypassing bound
parameters is probably the most widely exploited security hole in
modern web applications.   SQLAlchemy has limited ability to do this
stringification in certain circumstances such as that of emitting DDL.
In order to access this functionality one can use the ``literal_binds``
flag, passed to ``compile_kwargs``::

    from sqlalchemy.sql import table, column, select

    t = table('t', column('x'))

    s = select([t]).where(t.c.x == 5)

    print s.compile(compile_kwargs={"literal_binds": True})
the above approach has the caveats that it is only supported for basic
types, such as ints and strings, and furthermore if a :func:`.bindparam`
without a pre-set value is used directly, it won't be able to
stringify that either.   Additionally, the ``compile_kwargs`` argument
itself is only available as of SQLAlchemy 0.9; there are slightly
more verbose ways of getting ``literal_binds`` injected with 0.8.

.. topic:: Applying compiler kw arguments prior to 0.9

    We can provide a fixed set of ``**kw`` by calling upon ``process()``
    directly::

        compiled = s.compile()
        print compiled.process(s, literal_binds=True)

    Note that in this approach the statement is actually being stringified
    twice, hence using ``compile_kwargs`` in 0.9 should be preferred.
If we want to skip using ``literal_binds`` altogether due to the above
caveats, we can take the approach of replacing them out ahead of time
with whatever we want::

    from sqlalchemy.sql import visitors, text

    def replace_bindparam(element):
        if hasattr(element, 'effective_value'):
            return text(str(element.effective_value))

    s = visitors.replacement_traverse(s, {}, replace_bindparam)
Still another approach to injecting functionality where bound
parameters are concerned is to use the :doc:`Compilation Extension
API <core/compiler>`::

    from sqlalchemy.ext.compiler import compiles
    from sqlalchemy.sql.expression import BindParameter

    s = select([t]).where(t.c.x == 5)

    @compiles(BindParameter)
    def as_str(element, compiler, **kw):
        if 'binds_as_str' in kw:
            return str(element.effective_value)
        else:
            return compiler.visit_bindparam(element, **kw)

    print s.compile(compile_kwargs={"binds_as_str": True})

Above, we pass a self-defined flag ``binds_as_str`` through the compiler,
which we then intercept within our custom render method for :class:`.BindParameter`.
Why does ``.col.in_([])`` Produce ``col != col``? Why not ``1=0``?
-------------------------------------------------------------------

this differs from :attr:`.Mapper.mapped_table` in the case of a mapper mapped
using inheritance to a composed selectable.
.. _faq_combining_columns:

I'm getting a warning or error about "Implicitly combining column X under attribute Y"
--------------------------------------------------------------------------------------

This condition refers to when a mapping contains two columns that are being
mapped under the same attribute name due to their name, but there's no indication
that this is intentional.  A mapped class needs to have explicit names for
every attribute that is to store an independent value; when two columns have the
same name and aren't disambiguated, they fall under the same attribute and
the effect is that the value from one column is **copied** into the other, based
on which column was assigned to the attribute first.

This behavior is often desirable and is allowed without warning in the case
where the two columns are linked together via a foreign key relationship
within an inheritance mapping.   When the warning or exception occurs, the
issue can be resolved by either assigning the columns to differently-named
attributes, or if combining them together is desired, by using
:func:`.column_property` to make this explicit.
Given the example as follows::

    from sqlalchemy import Integer, Column, ForeignKey
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class A(Base):
        __tablename__ = 'a'

        id = Column(Integer, primary_key=True)

    class B(A):
        __tablename__ = 'b'

        id = Column(Integer, primary_key=True)
        a_id = Column(Integer, ForeignKey('a.id'))

As of SQLAlchemy version 0.9.5, the above condition is detected, and will
warn that the ``id`` column of ``A`` and ``B`` is being combined under
the same-named attribute ``id``, which above is a serious issue since it means
that a ``B`` object's primary key will always mirror that of its ``A``.
A mapping which resolves this is as follows::

    class A(Base):
        __tablename__ = 'a'

        id = Column(Integer, primary_key=True)

    class B(A):
        __tablename__ = 'b'

        b_id = Column('id', Integer, primary_key=True)
        a_id = Column(Integer, ForeignKey('a.id'))

Suppose we did want ``A.id`` and ``B.id`` to be mirrors of each other, despite
the fact that ``B.a_id`` is where ``A.id`` is related.  We could combine
them together using :func:`.column_property`::

    class A(Base):
        __tablename__ = 'a'

        id = Column(Integer, primary_key=True)

    class B(A):
        __tablename__ = 'b'

        # probably not what you want, but this is a demonstration
        id = column_property(Column(Integer, primary_key=True), A.id)
        a_id = Column(Integer, ForeignKey('a.id'))
I'm using Declarative and setting primaryjoin/secondaryjoin using an ``and_()`` or ``or_()``, and I am getting an error message about foreign keys.
------------------------------------------------------------------------------------------------------------------------------------------------------------------

Are you doing this?::

    class MyClass(Base):
        # ....

        foo = relationship("Dest", primaryjoin=and_("MyClass.id==Dest.foo_id", "MyClass.foo==Dest.bar"))

That's an ``and_()`` of two string expressions, which SQLAlchemy cannot apply any mapping towards.  Declarative allows :func:`.relationship` arguments to be specified as strings, which are converted into expression objects using ``eval()``.   But this doesn't occur inside of an ``and_()`` expression - it's a special operation declarative applies only to the *entirety* of what's passed to primaryjoin or other arguments as a string::

    class MyClass(Base):
        # ....

        foo = relationship("Dest", primaryjoin="and_(MyClass.id==Dest.foo_id, MyClass.foo==Dest.bar)")

Or if the objects you need are already available, skip the strings::

    class MyClass(Base):
        # ....

        foo = relationship(Dest, primaryjoin=and_(MyClass.id==Dest.foo_id, MyClass.foo==Dest.bar))

The same idea applies to all the other arguments, such as ``foreign_keys``::

    # wrong !
    foo = relationship(Dest, foreign_keys=["Dest.foo_id", "Dest.bar_id"])

    # correct !
    foo = relationship(Dest, foreign_keys="[Dest.foo_id, Dest.bar_id]")

    # also correct !
    foo = relationship(Dest, foreign_keys=[Dest.foo_id, Dest.bar_id])

    # if you're using columns from the class that you're inside of, just use the column objects !
    class MyClass(Base):
        foo_id = Column(...)
        bar_id = Column(...)
        # ...

        foo = relationship(Dest, foreign_keys=[foo_id, bar_id])
How can I profile a SQLAlchemy powered application?
---------------------------------------------------

Looking for performance issues typically involves two strategies.  One
is query profiling, and the other is code profiling.

Query Profiling
^^^^^^^^^^^^^^^^

Sometimes just plain SQL logging (enabled via python's logging module
or via the ``echo=True`` argument on :func:`.create_engine`) can give an
idea how long things are taking.  For example, if you log something
right after a SQL operation, you'd see something like this in your
log::

    17:37:48,325 INFO  [sqlalchemy.engine.base.Engine.0x...048c] SELECT ...
    17:37:48,326 INFO  [sqlalchemy.engine.base.Engine.0x...048c] {<params>}
    17:37:48,660 DEBUG [myapp.somemessage]

if you logged ``myapp.somemessage`` right after the operation, you know
it took 334ms to complete the SQL part of things.
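The 334ms figure is just timestamp subtraction; as a sanity check, the same
arithmetic in plain Python, with the timestamps hardcoded from the example log
lines above::

    from datetime import datetime

    t1 = datetime.strptime("17:37:48,326", "%H:%M:%S,%f")
    t2 = datetime.strptime("17:37:48,660", "%H:%M:%S,%f")
    print(int((t2 - t1).total_seconds() * 1000))   # 334
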
Logging SQL will also illustrate if dozens/hundreds of queries are
being issued which could be better organized into much fewer queries.
When using the SQLAlchemy ORM, the "eager loading"
feature is provided to partially (:func:`.contains_eager()`) or fully
(:func:`.joinedload()`, :func:`.subqueryload()`)
automate this activity, but without
the ORM "eager loading" typically means to use joins so that results across multiple
tables can be loaded in one result set instead of multiplying numbers
of queries as more depth is added (i.e. ``r + r*r2 + r*r2*r3`` ...)
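To put rough numbers on that multiplication, here is a plain-Python sketch
(the fan-out figures are made up) counting queries issued by per-row lazy
loads, versus one query per depth level with eager loading::

    def lazy_query_count(fanout_per_level):
        # one query for the root collection, then one lazy-load query
        # per parent object at each additional depth level
        total, parents = 1, fanout_per_level[0]
        for fanout in fanout_per_level[1:]:
            total += parents
            parents *= fanout
        return total

    # 10 roots, 5 children each, 3 grandchildren each:
    print(lazy_query_count([10, 5, 3]))   # 61 queries lazily, vs. 3 eager queries
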
For more long-term profiling of queries, or to implement an application-side
"slow query" monitor, events can be used to intercept cursor executions,
using a recipe like the following::

    from sqlalchemy import event
    from sqlalchemy.engine import Engine
    import logging
    import time

    logging.basicConfig()
    logger = logging.getLogger("myapp.sqltime")
    logger.setLevel(logging.DEBUG)

    @event.listens_for(Engine, "before_cursor_execute")
    def before_cursor_execute(conn, cursor, statement,
                            parameters, context, executemany):
        conn.info.setdefault('query_start_time', []).append(time.time())
        logger.debug("Start Query: %s" % statement)

    @event.listens_for(Engine, "after_cursor_execute")
    def after_cursor_execute(conn, cursor, statement,
                            parameters, context, executemany):
        total = time.time() - conn.info['query_start_time'].pop(-1)
        logger.debug("Query Complete!")
        logger.debug("Total Time: %f" % total)
Above, we use the :meth:`.ConnectionEvents.before_cursor_execute` and
:meth:`.ConnectionEvents.after_cursor_execute` events to establish an interception
point around when a statement is executed.  We attach a timer onto the
connection using the :class:`._ConnectionRecord.info` dictionary; we use a
stack here for the occasional case where the cursor execute events may be nested.

Code Profiling
^^^^^^^^^^^^^^^

If logging reveals that individual queries are taking too long, you'd
need a breakdown of how much time was spent within the database
processing the query, sending results over the network, being handled
by the :term:`DBAPI`, and finally being received by SQLAlchemy's result set
and/or ORM layer.   Each of these stages can present their own
individual bottlenecks, depending on specifics.
For that you need to use the
`Python Profiling Module <https://docs.python.org/2/library/profile.html>`_.
Below is a simple recipe which works profiling into a context manager::

    import cProfile
    import StringIO
    import pstats
    import contextlib

    @contextlib.contextmanager
    def profiled():
        pr = cProfile.Profile()
        pr.enable()
        yield
        pr.disable()
        s = StringIO.StringIO()
        ps = pstats.Stats(pr, stream=s).sort_stats('cumulative')
        ps.print_stats()
        # uncomment this to see who's calling what
        # ps.print_callers()
        print s.getvalue()

To profile a section of code::

    with profiled():
        Session.query(FooClass).filter(FooClass.somevalue==8).all()
The output of profiling can be used to give an idea where time is
being spent.   A section of profiling output looks like this::

    13726 function calls (13042 primitive calls) in 0.014 seconds

    Ordered by: cumulative time

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    222/21    0.001    0.000    0.011    0.001 lib/sqlalchemy/orm/loading.py:26(instances)
    220/20    0.002    0.000    0.010    0.001 lib/sqlalchemy/orm/loading.py:327(_instance)
    220/20    0.000    0.000    0.010    0.000 lib/sqlalchemy/orm/loading.py:284(populate_state)
       20    0.000    0.000    0.010    0.000 lib/sqlalchemy/orm/strategies.py:987(load_collection_from_subq)
       20    0.000    0.000    0.009    0.000 lib/sqlalchemy/orm/strategies.py:935(get)
        1    0.000    0.000    0.009    0.009 lib/sqlalchemy/orm/strategies.py:940(_load)
       21    0.000    0.000    0.008    0.000 lib/sqlalchemy/orm/strategies.py:942(<genexpr>)
        2    0.000    0.000    0.004    0.002 lib/sqlalchemy/orm/query.py:2400(__iter__)
        2    0.000    0.000    0.002    0.001 lib/sqlalchemy/orm/query.py:2414(_execute_and_instances)
        2    0.000    0.000    0.002    0.001 lib/sqlalchemy/engine/base.py:659(execute)
        2    0.000    0.000    0.002    0.001 lib/sqlalchemy/sql/elements.py:321(_execute_on_connection)
        2    0.000    0.000    0.002    0.001 lib/sqlalchemy/engine/base.py:788(_execute_clauseelement)

Above, we can see that the ``instances()`` SQLAlchemy function was called 222
times (recursively, and 21 times from the outside), taking a total of .011
seconds for all calls combined.
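The ``222/21`` notation in the ``ncalls`` column means total calls versus
primitive (non-recursive) calls; the same notation can be reproduced with any
recursive function and the stdlib profiler - this illustration is independent
of SQLAlchemy (Python 3 spelling shown)::

    import cProfile
    import pstats
    import io

    def fact(n):
        return 1 if n <= 1 else n * fact(n - 1)

    pr = cProfile.Profile()
    pr.enable()
    fact(5)
    pr.disable()

    s = io.StringIO()
    pstats.Stats(pr, stream=s).print_stats()
    # the fact() row reads "5/1": five total calls, one primitive call
    print("5/1" in s.getvalue())   # True
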
The specifics of these calls can tell us where the time is being spent.
If for example, you see time being spent within ``cursor.execute()``,
e.g. against the DBAPI::

    2    0.102    0.102    0.204    0.102 {method 'execute' of 'sqlite3.Cursor' objects}

this would indicate that the database is taking a long time to start returning
results, and it means your query should be optimized, either by adding indexes
or restructuring the query and/or underlying schema.  For that task,
analysis of the query plan is warranted, using a system such as EXPLAIN,
SHOW PLAN, etc. as is provided by the database backend.
Result Fetching Slowness - Core
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If on the other hand you see many thousands of calls related to fetching rows,
or very long calls to ``fetchall()``, it may
mean your query is returning more rows than expected, or that the fetching
of rows itself is slow.   The ORM itself typically uses ``fetchall()`` to fetch
rows (or ``fetchmany()`` if the :meth:`.Query.yield_per` option is used).

An inordinately large number of rows would be indicated
by a very slow call to ``fetchall()`` at the DBAPI level::

    2    0.300    0.600    0.300    0.600 {method 'fetchall' of 'sqlite3.Cursor' objects}

An unexpectedly large number of rows, even if the ultimate result doesn't seem
to have many rows, can be the result of a cartesian product - when multiple
sets of rows are combined together without appropriately joining the tables
together.   It's often easy to produce this behavior with SQLAlchemy Core or
ORM query if the wrong :class:`.Column` objects are used in a complex query,
pulling in additional FROM clauses that are unexpected.
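The row-count blowup of a cartesian product is easy to see in miniature - here
plain Python lists stand in for two tables joined with no criteria (the table
contents are made up)::

    a_rows = [1, 2, 3]        # three rows in table "a"
    b_rows = ['x', 'y']       # two rows in table "b"

    # with no join criteria relating a and b, every row pairs with every row
    cross = [(a, b) for a in a_rows for b in b_rows]
    print(len(cross))   # 6 rows, not the 3 you might have expected
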
On the other hand, a fast call to ``fetchall()`` at the DBAPI level, but then
slowness when SQLAlchemy's :class:`.ResultProxy` is asked to do a ``fetchall()``,
may indicate slowness in processing of datatypes, such as unicode conversions
and similar::

    # the DBAPI cursor is fast...
    2    0.020    0.040    0.020    0.040 {method 'fetchall' of 'sqlite3.Cursor' objects}

    # but SQLAlchemy's result proxy is slow, this is type-level processing
    2    0.100    0.200    0.100    0.200 lib/sqlalchemy/engine/result.py:778(fetchall)

In some cases, a backend might be doing type-level processing that isn't
needed.   More specifically, seeing calls within the type API that are slow
are better indicators - below is what it looks like when we use a type like
this::

    from sqlalchemy import TypeDecorator, String
    import time

    class Foo(TypeDecorator):
        impl = String

        def process_result_value(self, value, thing):
            # intentionally add slowness for illustration purposes
            time.sleep(.001)
            return value

the profiling output of this intentionally slow operation can be seen like this::

    200    0.001    0.000    0.237    0.001 lib/sqlalchemy/sql/type_api.py:911(process)
    200    0.001    0.000    0.236    0.001 test.py:28(process_result_value)
    200    0.235    0.001    0.235    0.001 {time.sleep}
that is, we see many expensive calls within the ``type_api`` system, and the actual
time consuming thing is the ``time.sleep()`` call.

Make sure to check the :doc:`Dialect documentation <dialects/index>`
for notes on known performance tuning suggestions at this level, especially for
databases like Oracle.  There may be systems related to ensuring numeric accuracy
or string processing that may not be needed in all cases.

There also may be even more low-level points at which row-fetching performance is suffering;
for example, if time spent seems to focus on a call like ``socket.receive()``,
that could indicate that everything is fast except for the actual network connection,
and too much time is spent with data moving over the network.
Result Fetching Slowness - ORM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To detect slowness in ORM fetching of rows (which is the most common area
of performance concern), calls like ``populate_state()`` and ``_instance()`` will
illustrate individual ORM object populations::

    # the ORM calls _instance for each ORM-loaded row it sees, and
    # populate_state for each ORM-loaded row that results in the population
    # of an object's attributes
    220/20    0.001    0.000    0.010    0.000 lib/sqlalchemy/orm/loading.py:327(_instance)
    220/20    0.000    0.000    0.009    0.000 lib/sqlalchemy/orm/loading.py:284(populate_state)

The ORM's slowness in turning rows into ORM-mapped objects is a product
of the complexity of this operation combined with the overhead of cPython.
Common strategies to mitigate this include:

* fetch individual columns instead of full entities, that is::

      session.query(User.id, User.name)

* Use :class:`.Bundle` objects to organize column-based results::

      u_b = Bundle('user', User.id, User.name)
      a_b = Bundle('address', Address.id, Address.email)

      for user, address in session.query(u_b, a_b).join(User.addresses):
          # ...

* Use result caching - see :ref:`examples_caching` for an in-depth example

* Consider a faster interpreter like that of Pypy.
The output of a profile can be a little daunting but after some
practice they are very easy to read.

If you're feeling ambitious, there's also a more involved example of
SQLAlchemy profiling within the SQLAlchemy unit tests in the
``tests/aaa_profiling`` section.  Tests in this area
use decorators that assert a
maximum number of method calls being used for particular operations,
so that if something inefficient gets checked in, the tests will
reveal it (it is important to note that in cPython, function calls have
the highest overhead of any operation, and the count of calls is more
often than not nearly proportional to time spent).  Of note are the
"zoomark" tests which use a fancy "SQL capturing" scheme which
cuts out the overhead of the DBAPI from the equation - although that
technique isn't really necessary for garden-variety profiling.
I'm inserting 400,000 rows with the ORM and it's really slow!
--------------------------------------------------------------

The SQLAlchemy ORM uses the :term:`unit of work` pattern when synchronizing
changes to the database. This pattern goes far beyond simple "inserts"
of data. It includes that attributes which are assigned on objects are
received using an attribute instrumentation system which tracks
changes on objects as they are made, includes that all rows inserted
are tracked in an identity map which has the effect that for each row
SQLAlchemy must retrieve its "last inserted id" if not already given,
and also involves that rows to be inserted are scanned and sorted for
dependencies as needed.   Objects are also subject to a fair degree of
bookkeeping in order to keep all of this running, which for a very
large number of rows at once can create an inordinate amount of time
spent with large data structures, hence it's best to chunk these.
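A sketch of the chunking idea - the ``chunks()`` helper below is hypothetical,
not a SQLAlchemy API, and the commented lines only suggest how it would pair
with a ``Session``::

    def chunks(seq, size):
        # yield successive slices of seq, each of at most `size` items
        for start in range(0, len(seq), size):
            yield seq[start:start + size]

    # hypothetical usage with an ORM session:
    #
    # for batch in chunks(rows, 1000):
    #     for row in batch:
    #         session.add(SomeClass(**row))
    #     session.flush()   # keeps the unit of work's bookkeeping bounded
    # session.commit()

    print([len(batch) for batch in chunks(list(range(2500)), 1000)])   # [1000, 1000, 500]
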
Basically, unit of work is a large degree of automation in order to
automate the task of persisting a complex object graph into a
relational database with no explicit persistence code, and this
automation has a price.

ORMs are basically not intended for high-performance bulk inserts -
this is the whole reason SQLAlchemy offers the Core in addition to the
ORM as a first-class component.

For the use case of fast bulk inserts, the
SQL generation and execution system that the ORM builds on top of
is part of the Core.  Using this system directly, we can produce an INSERT that
is competitive with using the raw database API directly.
The example below illustrates time-based tests for four different
methods of inserting rows, going from the most automated to the least.
With cPython 2.7, runtimes observed::

    classics-MacBook-Pro:sqlalchemy classic$ python test.py
    SQLAlchemy ORM: Total time for 100000 records 14.3528850079 secs
    SQLAlchemy ORM pk given: Total time for 100000 records 10.0164160728 secs
    SQLAlchemy Core: Total time for 100000 records 0.775382995605 secs
    sqlite3: Total time for 100000 records 0.676795005798 sec

We can reduce the time by a factor of three using recent versions of `Pypy <http://pypy.org/>`_::

    classics-MacBook-Pro:sqlalchemy classic$ /usr/local/src/pypy-2.1-beta2-osx64/bin/pypy test.py
    SQLAlchemy ORM: Total time for 100000 records 5.88369488716 secs
    SQLAlchemy ORM pk given: Total time for 100000 records 3.52294301987 secs
    SQLAlchemy Core: Total time for 100000 records 0.613556146622 secs
    sqlite3: Total time for 100000 records 0.442467927933 sec
The script used for the tests is as follows::

    import time
    import sqlite3

    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.orm import scoped_session, sessionmaker

    Base = declarative_base()
    DBSession = scoped_session(sessionmaker())
    engine = None

    class Customer(Base):
        __tablename__ = "customer"
        id = Column(Integer, primary_key=True)
        name = Column(String(255))

    def init_sqlalchemy(dbname='sqlite:///sqlalchemy.db'):
        global engine
        engine = create_engine(dbname, echo=False)
        DBSession.configure(bind=engine, autoflush=False, expire_on_commit=False)
        Base.metadata.drop_all(engine)
        Base.metadata.create_all(engine)

    def test_sqlalchemy_orm(n=100000):
        init_sqlalchemy()
        t0 = time.time()
        for i in range(n):
            customer = Customer()
            customer.name = 'NAME ' + str(i)
            DBSession.add(customer)
            if i % 1000 == 0:
                DBSession.flush()
        DBSession.commit()
        print("SQLAlchemy ORM: Total time for " + str(n) +
            " records " + str(time.time() - t0) + " secs")

    def test_sqlalchemy_orm_pk_given(n=100000):
        init_sqlalchemy()
        t0 = time.time()
        for i in range(n):
            customer = Customer(id=i+1, name="NAME " + str(i))
            DBSession.add(customer)
            if i % 1000 == 0:
                DBSession.flush()
        DBSession.commit()
        print("SQLAlchemy ORM pk given: Total time for " + str(n) +
            " records " + str(time.time() - t0) + " secs")

    def test_sqlalchemy_core(n=100000):
        init_sqlalchemy()
        t0 = time.time()
        engine.execute(
            Customer.__table__.insert(),
            [{"name": 'NAME ' + str(i)} for i in range(n)]
        )
        print("SQLAlchemy Core: Total time for " + str(n) +
            " records " + str(time.time() - t0) + " secs")

    def init_sqlite3(dbname):
        conn = sqlite3.connect(dbname)
        c = conn.cursor()
        c.execute("DROP TABLE IF EXISTS customer")
        c.execute("CREATE TABLE customer (id INTEGER NOT NULL, "
                  "name VARCHAR(255), PRIMARY KEY(id))")
        conn.commit()
        return conn

    def test_sqlite3(n=100000, dbname='sqlite3.db'):
        conn = init_sqlite3(dbname)
        c = conn.cursor()
        t0 = time.time()
        for i in range(n):
            row = ('NAME ' + str(i),)
            c.execute("INSERT INTO customer (name) VALUES (?)", row)
        conn.commit()
        print("sqlite3: Total time for " + str(n) +
            " records " + str(time.time() - t0) + " sec")

    if __name__ == '__main__':
        test_sqlalchemy_orm(100000)
        test_sqlalchemy_orm_pk_given(100000)
        test_sqlalchemy_core(100000)
        test_sqlite3(100000)
Sessions / Queries
===================

"This Session's transaction has been rolled back due to a previous exception during flush." (or similar)
---------------------------------------------------------------------------------------------------------

starting/ending the "frame" is kept consistently with the code external to the
``flush()``, and we made a decision that this was the most consistent approach.
How do I make a Query that always adds a certain filter to every query?
------------------------------------------------------------------------

relationships naturally.  Therefore, the recommended way to
set ``o.foo`` is to do just that - set it!::

    foo = Session.query(Foo).get(7)
    o.foo = foo
    Session.commit()

Manipulation of foreign key attributes is of course entirely legal.  However,
setting a foreign-key attribute to a new value currently does not trigger
an "expire" event of the :func:`.relationship` in which it's involved.  This means
that for the following sequence::

    o = Session.query(SomeClass).first()
    assert o.foo is None  # accessing an un-set attribute sets it to None
    o.foo_id = 7
``o.foo`` is initialized to ``None`` when we first accessed it.  Setting
``o.foo_id = 7`` will have the value of "7" as pending, but no flush
has occurred - so ``o.foo`` is still ``None``::

    # attribute is already set to None, has not been
    # reconciled with o.foo_id = 7 yet
    assert o.foo is None

For ``o.foo`` to load based on the foreign key mutation is usually achieved
naturally after the commit, which both flushes the new foreign key value
and expires all state::

    Session.commit()  # expires all attributes

    foo_7 = Session.query(Foo).get(7)

    assert o.foo is foo_7  # o.foo lazyloads on access
A more minimal operation is to expire the attribute individually - this can
be performed for any :term:`persistent` object using :meth:`.Session.expire`::

    o = Session.query(SomeClass).first()
    o.foo_id = 7
    Session.expire(o, ['foo'])  # object must be persistent for this

    foo_7 = Session.query(Foo).get(7)

    assert o.foo is foo_7  # o.foo lazyloads on access
Note that if the object is not persistent but present in the :class:`.Session`,
it's known as :term:`pending`.   This means the row for the object has not been
INSERTed into the database yet.  For such an object, setting ``foo_id`` does not
have meaning until the row is inserted; otherwise there is no row yet::

    new_obj = SomeClass()
    new_obj.foo_id = 7

    Session.add(new_obj)

    # accessing an un-set attribute sets it to None
    assert new_obj.foo is None

    Session.flush()  # emits INSERT

    # expire this because we already set .foo to None
    Session.expire(new_obj, ['foo'])

    assert new_obj.foo is foo_7  # now it loads
.. topic:: Attribute loading for non-persistent objects

    One variant on the "pending" behavior above is if we use the flag
    ``load_on_pending`` on :func:`.relationship`.   When this flag is set, the
    lazy loader will emit for ``new_obj.foo`` before the INSERT proceeds; another
    variant of this is to use the :meth:`.Session.enable_relationship_loading`
    method, which can "attach" an object to a :class:`.Session` in such a way that
    many-to-one relationships load as according to foreign key attributes
    regardless of the object being in any particular state.
    Both techniques are **not recommended for general use**; they were added to suit
    specific programming scenarios encountered by users which involve the repurposing
    of the ORM's usual object states.
The recipe `ExpireRelationshipOnFKChange <http://www.sqlalchemy.org/trac/wiki/UsageRecipes/ExpireRelationshipOnFKChange>`_ features an example using SQLAlchemy events
in order to coordinate the setting of foreign key attributes with many-to-one
relationships.