2
/* Dictionary object implementation using a hash table */
4
/* The distribution includes a separate file, Objects/dictnotes.txt,
5
describing explorations into dictionary design and optimization.
6
It covers typical dictionary use patterns, the parameters for
7
tuning dictionaries, and several ideas for possible optimizations.
11
#include "stringlib/eq.h"
14
/* Set a key error with the specified argument, wrapping it in a
15
* tuple automatically so that tuple keys are not unpacked as the
16
* exception arguments. */
18
set_key_error(PyObject *arg)
21
tup = PyTuple_Pack(1, arg);
23
return; /* caller will expect error to be set anyway */
24
PyErr_SetObject(PyExc_KeyError, tup);
28
/* Define this out if you don't want conversion statistics on exit. */
29
#undef SHOW_CONVERSION_COUNTS
31
/* See large comment block below. This must be >= 1. */
32
#define PERTURB_SHIFT 5
35
Major subtleties ahead: Most hash schemes depend on having a "good" hash
36
function, in the sense of simulating randomness. Python doesn't: its most
37
important hash functions (for strings and ints) are very regular in common
40
>>> map(hash, (0, 1, 2, 3))
42
>>> map(hash, ("namea", "nameb", "namec", "named"))
43
[-1658398457, -1658398460, -1658398459, -1658398462]
46
This isn't necessarily bad! To the contrary, in a table of size 2**i, taking
47
the low-order i bits as the initial table index is extremely fast, and there
48
are no collisions at all for dicts indexed by a contiguous range of ints.
49
The same is approximately true when keys are "consecutive" strings. So this
50
gives better-than-random behavior in common cases, and that's very desirable.
52
OTOH, when collisions occur, the tendency to fill contiguous slices of the
53
hash table makes a good collision resolution strategy crucial. Taking only
54
the last i bits of the hash code is also vulnerable: for example, consider
55
the list [i << 16 for i in range(20000)] as a set of keys. Since ints are
56
their own hash codes, and this fits in a dict of size 2**15, the last 15 bits
57
of every hash code are all 0: they *all* map to the same table index.
59
But catering to unusual cases should not slow the usual ones, so we just take
60
the last i bits anyway. It's up to collision resolution to do the rest. If
61
we *usually* find the key we're looking for on the first try (and, it turns
62
out, we usually do -- the table load factor is kept under 2/3, so the odds
63
are solidly in our favor), then it makes best sense to keep the initial index
64
computation dirt cheap.
66
The first half of collision resolution is to visit table indices via this
69
j = ((5*j) + 1) mod 2**i
71
For any initial j in range(2**i), repeating that 2**i times generates each
72
int in range(2**i) exactly once (see any text on random-number generation for
73
proof). By itself, this doesn't help much: like linear probing (setting
74
j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed
75
order. This would be bad, except that's not the only thing we do, and it's
76
actually *good* in the common cases where hash keys are consecutive. In an
77
example that's really too small to make this entirely clear, for a table of
78
size 2**3 the order of indices is:
80
0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]
82
If two things come in at index 5, the first place we look after is index 2,
83
not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.
84
Linear probing is deadly in this case because there the fixed probe order
85
is the *same* as the order consecutive keys are likely to arrive. But it's
86
extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,
87
and certain that consecutive hash codes do not.
89
The other half of the strategy is to get the other bits of the hash code
90
into play. This is done by initializing a (unsigned) vrbl "perturb" to the
91
full hash code, and changing the recurrence to:
93
j = (5*j) + 1 + perturb;
94
perturb >>= PERTURB_SHIFT;
95
use j % 2**i as the next table index;
97
Now the probe sequence depends (eventually) on every bit in the hash code,
98
and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,
99
because it quickly magnifies small differences in the bits that didn't affect
100
the initial index. Note that because perturb is unsigned, if the recurrence
101
is executed often enough perturb eventually becomes and remains 0. At that
102
point (very rarely reached) the recurrence is on (just) 5*j+1 again, and
103
that's certain to find an empty slot eventually (since it generates every int
104
in range(2**i), and we make sure there's always at least one empty slot).
106
Selecting a good value for PERTURB_SHIFT is a balancing act. You want it
107
small so that the high bits of the hash code continue to affect the probe
108
sequence across iterations; but you want it large so that in really bad cases
109
the high-order hash bits have an effect on early iterations. 5 was "the
110
best" in minimizing total collisions across experiments Tim Peters ran (on
111
both normal and pathological cases), but 4 and 6 weren't significantly worse.
113
Historical: Reimer Behrends contributed the idea of using a polynomial-based
114
approach, using repeated multiplication by x in GF(2**n) where an irreducible
115
polynomial for each table size was chosen such that x was a primitive root.
116
Christian Tismer later extended that to use division by x instead, as an
117
efficient way to get the high bits of the hash code into play. This scheme
118
also gave excellent collision statistics, but was more expensive: two
119
if-tests were required inside the loop; computing "the next" index took about
120
the same number of operations but without as much potential parallelism
121
(e.g., computing 5*j can go on at the same time as computing 1+perturb in the
122
above, and then shifting perturb can be done while the table index is being
123
masked); and the PyDictObject struct required a member to hold the table's
124
polynomial. In Tim's experiments the current scheme ran faster, produced
125
equally good collision statistics, needed less code & used less memory.
127
Theoretical Python 2.5 headache: hash codes are only C "long", but
128
sizeof(Py_ssize_t) > sizeof(long) may be possible. In that case, and if a
129
dict is genuinely huge, then only the slots directly reachable via indexing
130
by a C long can be the first slot in a probe sequence. The probe sequence
131
will still eventually reach every slot in the table, but the collision rate
132
on initial probes may be much higher than this scheme was designed for.
133
Getting a hash code as fat as Py_ssize_t is the only real cure. But in
134
practice, this probably won't make a lick of difference for many years (at
135
which point everyone will have terabytes of RAM on 64-bit boxes).
138
/* Object used as dummy key to fill deleted entries */
139
static PyObject *dummy = NULL; /* Initialized by first call to newPyDictObject() */
149
/* forward declarations */
151
lookdict_unicode(PyDictObject *mp, PyObject *key, long hash);
153
#ifdef SHOW_CONVERSION_COUNTS
154
static long created = 0L;
155
static long converted = 0L;
160
fprintf(stderr, "created %ld string dicts\n", created);
161
fprintf(stderr, "converted %ld to normal dicts\n", converted);
162
fprintf(stderr, "%.2f%% conversion rate\n", (100.0*converted)/created);
166
/* Debug statistic to compare allocations with reuse through the free list */
167
#undef SHOW_ALLOC_COUNT
168
#ifdef SHOW_ALLOC_COUNT
169
static size_t count_alloc = 0;
170
static size_t count_reuse = 0;
175
fprintf(stderr, "Dict allocations: %" PY_FORMAT_SIZE_T "d\n",
177
fprintf(stderr, "Dict reuse through freelist: %" PY_FORMAT_SIZE_T
179
fprintf(stderr, "%.2f%% reuse rate\n\n",
180
(100.0*count_reuse/(count_alloc+count_reuse)));
184
/* Initialization macros.
185
There are two ways to create a dict: PyDict_New() is the main C API
186
function, and the tp_new slot maps to dict_new(). In the latter case we
187
can save a little time over what PyDict_New does because it's guaranteed
188
that the PyDictObject struct is already zeroed out.
189
Everyone except dict_new() should use EMPTY_TO_MINSIZE (unless they have
190
an excellent reason not to).
193
#define INIT_NONZERO_DICT_SLOTS(mp) do { \
194
(mp)->ma_table = (mp)->ma_smalltable; \
195
(mp)->ma_mask = PyDict_MINSIZE - 1; \
198
#define EMPTY_TO_MINSIZE(mp) do { \
199
memset((mp)->ma_smalltable, 0, sizeof((mp)->ma_smalltable)); \
200
(mp)->ma_used = (mp)->ma_fill = 0; \
201
INIT_NONZERO_DICT_SLOTS(mp); \
204
/* Dictionary reuse scheme to save calls to malloc, free, and memset */
205
#ifndef PyDict_MAXFREELIST
206
#define PyDict_MAXFREELIST 80
208
static PyDictObject *free_list[PyDict_MAXFREELIST];
209
static int numfree = 0;
217
op = free_list[--numfree];
218
assert(PyDict_CheckExact(op));
226
register PyDictObject *mp;
227
if (dummy == NULL) { /* Auto-initialize dummy */
228
dummy = PyUnicode_FromString("<dummy key>");
231
#ifdef SHOW_CONVERSION_COUNTS
232
Py_AtExit(show_counts);
234
#ifdef SHOW_ALLOC_COUNT
235
Py_AtExit(show_alloc);
239
mp = free_list[--numfree];
241
assert (Py_TYPE(mp) == &PyDict_Type);
242
_Py_NewReference((PyObject *)mp);
244
EMPTY_TO_MINSIZE(mp);
246
/* At least set ma_table and ma_mask; these are wrong
247
if an empty but presized dict is added to freelist */
248
INIT_NONZERO_DICT_SLOTS(mp);
250
assert (mp->ma_used == 0);
251
assert (mp->ma_table == mp->ma_smalltable);
252
assert (mp->ma_mask == PyDict_MINSIZE - 1);
253
#ifdef SHOW_ALLOC_COUNT
257
mp = PyObject_GC_New(PyDictObject, &PyDict_Type);
260
EMPTY_TO_MINSIZE(mp);
261
#ifdef SHOW_ALLOC_COUNT
265
mp->ma_lookup = lookdict_unicode;
266
#ifdef SHOW_CONVERSION_COUNTS
269
_PyObject_GC_TRACK(mp);
270
return (PyObject *)mp;
274
The basic lookup function used by all operations.
275
This is based on Algorithm D from Knuth Vol. 3, Sec. 6.4.
276
Open addressing is preferred over chaining since the link overhead for
277
chaining would be substantial (100% with typical malloc overhead).
279
The initial probe index is computed as hash mod the table size. Subsequent
280
probe indices are computed as explained earlier.
282
All arithmetic on hash should ignore overflow.
284
The details in this version are due to Tim Peters, building on many past
285
contributions by Reimer Behrends, Jyrki Alakuijala, Vladimir Marangozov and
288
lookdict() is general-purpose, and may return NULL if (and only if) a
289
comparison raises an exception (this was new in Python 2.5).
290
lookdict_unicode() below is specialized to string keys, comparison of which can
291
never raise an exception; that function can never return NULL. For both, when
292
the key isn't found a PyDictEntry* is returned for which the me_value field is
293
NULL; this is the slot in the dict at which the key would have been found, and
294
the caller can (if it wishes) add the <key, value> pair to the returned
298
lookdict(PyDictObject *mp, PyObject *key, register long hash)
301
register size_t perturb;
302
register PyDictEntry *freeslot;
303
register size_t mask = (size_t)mp->ma_mask;
304
PyDictEntry *ep0 = mp->ma_table;
305
register PyDictEntry *ep;
309
i = (size_t)hash & mask;
311
if (ep->me_key == NULL || ep->me_key == key)
314
if (ep->me_key == dummy)
317
if (ep->me_hash == hash) {
318
startkey = ep->me_key;
320
cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
324
if (ep0 == mp->ma_table && ep->me_key == startkey) {
329
/* The compare did major nasty stuff to the
331
* XXX A clever adversary could prevent this
332
* XXX from terminating.
334
return lookdict(mp, key, hash);
340
/* In the loop, me_key == dummy is by far (factor of 100s) the
341
least likely outcome, so test for that last. */
342
for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
343
i = (i << 2) + i + perturb + 1;
345
if (ep->me_key == NULL)
346
return freeslot == NULL ? ep : freeslot;
347
if (ep->me_key == key)
349
if (ep->me_hash == hash && ep->me_key != dummy) {
350
startkey = ep->me_key;
352
cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
356
if (ep0 == mp->ma_table && ep->me_key == startkey) {
361
/* The compare did major nasty stuff to the
363
* XXX A clever adversary could prevent this
364
* XXX from terminating.
366
return lookdict(mp, key, hash);
369
else if (ep->me_key == dummy && freeslot == NULL)
372
assert(0); /* NOT REACHED */
377
* Hacked up version of lookdict which can assume keys are always
378
* unicodes; this assumption allows testing for errors during
379
* PyObject_RichCompareBool() to be dropped; unicode-unicode
380
* comparisons never raise exceptions. This also means we don't need
381
* to go through PyObject_RichCompareBool(); we can always use
382
* unicode_eq() directly.
384
* This is valuable because dicts with only unicode keys are very common.
387
lookdict_unicode(PyDictObject *mp, PyObject *key, register long hash)
390
register size_t perturb;
391
register PyDictEntry *freeslot;
392
register size_t mask = (size_t)mp->ma_mask;
393
PyDictEntry *ep0 = mp->ma_table;
394
register PyDictEntry *ep;
396
/* Make sure this function doesn't have to handle non-unicode keys,
397
including subclasses of str; e.g., one reason to subclass
398
unicodes is to override __eq__, and for speed we don't cater to
400
if (!PyUnicode_CheckExact(key)) {
401
#ifdef SHOW_CONVERSION_COUNTS
404
mp->ma_lookup = lookdict;
405
return lookdict(mp, key, hash);
409
if (ep->me_key == NULL || ep->me_key == key)
411
if (ep->me_key == dummy)
414
if (ep->me_hash == hash && unicode_eq(ep->me_key, key))
419
/* In the loop, me_key == dummy is by far (factor of 100s) the
420
least likely outcome, so test for that last. */
421
for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
422
i = (i << 2) + i + perturb + 1;
424
if (ep->me_key == NULL)
425
return freeslot == NULL ? ep : freeslot;
426
if (ep->me_key == key
427
|| (ep->me_hash == hash
428
&& ep->me_key != dummy
429
&& unicode_eq(ep->me_key, key)))
431
if (ep->me_key == dummy && freeslot == NULL)
434
assert(0); /* NOT REACHED */
439
Internal routine to insert a new item into the table.
440
Used both by the internal resize routine and by the public insert routine.
441
Eats a reference to key and one to value.
442
Returns -1 if an error occurred, or 0 on success.
445
insertdict(register PyDictObject *mp, PyObject *key, long hash, PyObject *value)
448
register PyDictEntry *ep;
449
typedef PyDictEntry *(*lookupfunc)(PyDictObject *, PyObject *, long);
451
assert(mp->ma_lookup != NULL);
452
ep = mp->ma_lookup(mp, key, hash);
458
if (ep->me_value != NULL) {
459
old_value = ep->me_value;
460
ep->me_value = value;
461
Py_DECREF(old_value); /* which **CAN** re-enter */
465
if (ep->me_key == NULL)
468
assert(ep->me_key == dummy);
472
ep->me_hash = (Py_ssize_t)hash;
473
ep->me_value = value;
480
Internal routine used by dictresize() to insert an item which is
481
known to be absent from the dict. This routine also assumes that
482
the dict contains no deleted entries. Besides the performance benefit,
483
using insertdict() in dictresize() is dangerous (SF bug #1456209).
484
Note that no refcounts are changed by this routine; if needed, the caller
485
is responsible for incref'ing `key` and `value`.
488
insertdict_clean(register PyDictObject *mp, PyObject *key, long hash,
492
register size_t perturb;
493
register size_t mask = (size_t)mp->ma_mask;
494
PyDictEntry *ep0 = mp->ma_table;
495
register PyDictEntry *ep;
499
for (perturb = hash; ep->me_key != NULL; perturb >>= PERTURB_SHIFT) {
500
i = (i << 2) + i + perturb + 1;
503
assert(ep->me_value == NULL);
506
ep->me_hash = (Py_ssize_t)hash;
507
ep->me_value = value;
512
Restructure the table by allocating a new table and reinserting all
513
items again. When entries have been deleted, the new table may
514
actually be smaller than the old one.
517
dictresize(PyDictObject *mp, Py_ssize_t minused)
520
PyDictEntry *oldtable, *newtable, *ep;
522
int is_oldtable_malloced;
523
PyDictEntry small_copy[PyDict_MINSIZE];
525
assert(minused >= 0);
527
/* Find the smallest table size > minused. */
528
for (newsize = PyDict_MINSIZE;
529
newsize <= minused && newsize > 0;
537
/* Get space for a new table. */
538
oldtable = mp->ma_table;
539
assert(oldtable != NULL);
540
is_oldtable_malloced = oldtable != mp->ma_smalltable;
542
if (newsize == PyDict_MINSIZE) {
543
/* A large table is shrinking, or we can't get any smaller. */
544
newtable = mp->ma_smalltable;
545
if (newtable == oldtable) {
546
if (mp->ma_fill == mp->ma_used) {
547
/* No dummies, so no point doing anything. */
550
/* We're not going to resize it, but rebuild the
551
table anyway to purge old dummy entries.
552
Subtle: This is *necessary* if fill==size,
553
as lookdict needs at least one virgin slot to
554
terminate failing searches. If fill < size, it's
555
merely desirable, as dummies slow searches. */
556
assert(mp->ma_fill > mp->ma_used);
557
memcpy(small_copy, oldtable, sizeof(small_copy));
558
oldtable = small_copy;
562
newtable = PyMem_NEW(PyDictEntry, newsize);
563
if (newtable == NULL) {
569
/* Make the dict empty, using the new table. */
570
assert(newtable != oldtable);
571
mp->ma_table = newtable;
572
mp->ma_mask = newsize - 1;
573
memset(newtable, 0, sizeof(PyDictEntry) * newsize);
578
/* Copy the data over; this is refcount-neutral for active entries;
579
dummy entries aren't copied over, of course */
580
for (ep = oldtable; i > 0; ep++) {
581
if (ep->me_value != NULL) { /* active entry */
583
insertdict_clean(mp, ep->me_key, (long)ep->me_hash,
586
else if (ep->me_key != NULL) { /* dummy entry */
588
assert(ep->me_key == dummy);
589
Py_DECREF(ep->me_key);
591
/* else key == value == NULL: nothing to do */
594
if (is_oldtable_malloced)
599
/* Create a new dictionary pre-sized to hold an estimated number of elements.
600
Underestimates are okay because the dictionary will resize as necessary.
601
Overestimates just mean the dictionary will be more sparse than usual.
605
_PyDict_NewPresized(Py_ssize_t minused)
607
PyObject *op = PyDict_New();
609
if (minused>5 && op != NULL && dictresize((PyDictObject *)op, minused) == -1) {
616
/* Note that, for historical reasons, PyDict_GetItem() suppresses all errors
617
* that may occur (originally dicts supported only string keys, and exceptions
618
* weren't possible). So, while the original intent was that a NULL return
619
* meant the key wasn't present, in reality it can mean that, or that an error
620
* (suppressed) occurred while computing the key's hash, or that some error
621
* (suppressed) occurred when comparing keys in the dict's internal probe
622
* sequence. A nasty example of the latter is when a Python-coded comparison
623
* function hits a stack-depth error, which can cause this to return NULL
624
* even if the key is present.
627
PyDict_GetItem(PyObject *op, PyObject *key)
630
PyDictObject *mp = (PyDictObject *)op;
632
PyThreadState *tstate;
633
if (!PyDict_Check(op))
635
if (!PyUnicode_CheckExact(key) ||
636
(hash = ((PyUnicodeObject *) key)->hash) == -1)
638
hash = PyObject_Hash(key);
645
/* We can arrive here with a NULL tstate during initialization:
646
try running "python -Wi" for an example related to string
647
interning. Let's just hope that no exception occurs then... */
648
tstate = _PyThreadState_Current;
649
if (tstate != NULL && tstate->curexc_type != NULL) {
650
/* preserve the existing exception */
651
PyObject *err_type, *err_value, *err_tb;
652
PyErr_Fetch(&err_type, &err_value, &err_tb);
653
ep = (mp->ma_lookup)(mp, key, hash);
655
PyErr_Restore(err_type, err_value, err_tb);
660
ep = (mp->ma_lookup)(mp, key, hash);
669
/* Variant of PyDict_GetItem() that doesn't suppress exceptions.
670
This returns NULL *with* an exception set if an exception occurred.
671
It returns NULL *without* an exception set if the key wasn't present.
674
PyDict_GetItemWithError(PyObject *op, PyObject *key)
677
PyDictObject*mp = (PyDictObject *)op;
680
if (!PyDict_Check(op)) {
681
PyErr_BadInternalCall();
684
if (!PyUnicode_CheckExact(key) ||
685
(hash = ((PyUnicodeObject *) key)->hash) == -1)
687
hash = PyObject_Hash(key);
693
ep = (mp->ma_lookup)(mp, key, hash);
699
/* CAUTION: PyDict_SetItem() must guarantee that it won't resize the
700
* dictionary if it's merely replacing the value for an existing key.
701
* This means that it's safe to loop over a dictionary with PyDict_Next()
702
* and occasionally replace a value -- but you can't insert new keys or
706
PyDict_SetItem(register PyObject *op, PyObject *key, PyObject *value)
708
register PyDictObject *mp;
710
register Py_ssize_t n_used;
712
if (!PyDict_Check(op)) {
713
PyErr_BadInternalCall();
718
mp = (PyDictObject *)op;
719
if (!PyUnicode_CheckExact(key) ||
720
(hash = ((PyUnicodeObject *) key)->hash) == -1)
722
hash = PyObject_Hash(key);
726
assert(mp->ma_fill <= mp->ma_mask); /* at least one empty slot */
727
n_used = mp->ma_used;
730
if (insertdict(mp, key, hash, value) != 0)
732
/* If we added a key, we can safely resize. Otherwise just return!
733
* If fill >= 2/3 size, adjust size. Normally, this doubles or
734
* quaduples the size, but it's also possible for the dict to shrink
735
* (if ma_fill is much larger than ma_used, meaning a lot of dict
736
* keys have been * deleted).
738
* Quadrupling the size improves average dictionary sparseness
739
* (reducing collisions) at the cost of some memory and iteration
740
* speed (which loops over every possible entry). It also halves
741
* the number of expensive resize operations in a growing dictionary.
743
* Very large dictionaries (over 50K items) use doubling instead.
744
* This may help applications with severe memory constraints.
746
if (!(mp->ma_used > n_used && mp->ma_fill*3 >= (mp->ma_mask+1)*2))
748
return dictresize(mp, (mp->ma_used > 50000 ? 2 : 4) * mp->ma_used);
752
PyDict_DelItem(PyObject *op, PyObject *key)
754
register PyDictObject *mp;
756
register PyDictEntry *ep;
757
PyObject *old_value, *old_key;
759
if (!PyDict_Check(op)) {
760
PyErr_BadInternalCall();
764
if (!PyUnicode_CheckExact(key) ||
765
(hash = ((PyUnicodeObject *) key)->hash) == -1) {
766
hash = PyObject_Hash(key);
770
mp = (PyDictObject *)op;
771
ep = (mp->ma_lookup)(mp, key, hash);
774
if (ep->me_value == NULL) {
778
old_key = ep->me_key;
781
old_value = ep->me_value;
784
Py_DECREF(old_value);
790
PyDict_Clear(PyObject *op)
793
PyDictEntry *ep, *table;
794
int table_is_malloced;
796
PyDictEntry small_copy[PyDict_MINSIZE];
801
if (!PyDict_Check(op))
803
mp = (PyDictObject *)op;
809
table = mp->ma_table;
810
assert(table != NULL);
811
table_is_malloced = table != mp->ma_smalltable;
813
/* This is delicate. During the process of clearing the dict,
814
* decrefs can cause the dict to mutate. To avoid fatal confusion
815
* (voice of experience), we have to make the dict empty before
816
* clearing the slots, and never refer to anything via mp->xxx while
820
if (table_is_malloced)
821
EMPTY_TO_MINSIZE(mp);
824
/* It's a small table with something that needs to be cleared.
825
* Afraid the only safe way is to copy the dict entries into
826
* another small table first.
828
memcpy(small_copy, table, sizeof(small_copy));
830
EMPTY_TO_MINSIZE(mp);
832
/* else it's a small table that's already empty */
834
/* Now we can finally clear things. If C had refcounts, we could
835
* assert that the refcount on table is 1 now, i.e. that this function
836
* has unique access to it, so decref side-effects can't alter it.
838
for (ep = table; fill > 0; ++ep) {
845
Py_DECREF(ep->me_key);
846
Py_XDECREF(ep->me_value);
850
assert(ep->me_value == NULL);
854
if (table_is_malloced)
859
* Iterate over a dict. Use like so:
862
* PyObject *key, *value;
863
* i = 0; # important! i should not otherwise be changed by you
864
* while (PyDict_Next(yourdict, &i, &key, &value)) {
865
* Refer to borrowed references in key and value.
868
* CAUTION: In general, it isn't safe to use PyDict_Next in a loop that
869
* mutates the dict. One exception: it is safe if the loop merely changes
870
* the values associated with the keys (but doesn't insert new keys or
871
* delete keys), via PyDict_SetItem().
874
PyDict_Next(PyObject *op, Py_ssize_t *ppos, PyObject **pkey, PyObject **pvalue)
876
register Py_ssize_t i;
877
register Py_ssize_t mask;
878
register PyDictEntry *ep;
880
if (!PyDict_Check(op))
885
ep = ((PyDictObject *)op)->ma_table;
886
mask = ((PyDictObject *)op)->ma_mask;
887
while (i <= mask && ep[i].me_value == NULL)
893
*pkey = ep[i].me_key;
895
*pvalue = ep[i].me_value;
899
/* Internal version of PyDict_Next that returns a hash value in addition to the key and value.*/
901
_PyDict_Next(PyObject *op, Py_ssize_t *ppos, PyObject **pkey, PyObject **pvalue, long *phash)
903
register Py_ssize_t i;
904
register Py_ssize_t mask;
905
register PyDictEntry *ep;
907
if (!PyDict_Check(op))
912
ep = ((PyDictObject *)op)->ma_table;
913
mask = ((PyDictObject *)op)->ma_mask;
914
while (i <= mask && ep[i].me_value == NULL)
919
*phash = (long)(ep[i].me_hash);
921
*pkey = ep[i].me_key;
923
*pvalue = ep[i].me_value;
930
dict_dealloc(register PyDictObject *mp)
932
register PyDictEntry *ep;
933
Py_ssize_t fill = mp->ma_fill;
934
PyObject_GC_UnTrack(mp);
935
Py_TRASHCAN_SAFE_BEGIN(mp)
936
for (ep = mp->ma_table; fill > 0; ep++) {
939
Py_DECREF(ep->me_key);
940
Py_XDECREF(ep->me_value);
943
if (mp->ma_table != mp->ma_smalltable)
944
PyMem_DEL(mp->ma_table);
945
if (numfree < PyDict_MAXFREELIST && Py_TYPE(mp) == &PyDict_Type)
946
free_list[numfree++] = mp;
948
Py_TYPE(mp)->tp_free((PyObject *)mp);
949
Py_TRASHCAN_SAFE_END(mp)
953
dict_repr(PyDictObject *mp)
956
PyObject *s, *temp, *colon = NULL;
957
PyObject *pieces = NULL, *result = NULL;
958
PyObject *key, *value;
960
i = Py_ReprEnter((PyObject *)mp);
962
return i > 0 ? PyUnicode_FromString("{...}") : NULL;
965
if (mp->ma_used == 0) {
966
result = PyUnicode_FromString("{}");
970
pieces = PyList_New(0);
974
colon = PyUnicode_FromString(": ");
978
/* Do repr() on each key+value pair, and insert ": " between them.
979
Note that repr may mutate the dict. */
981
while (PyDict_Next((PyObject *)mp, &i, &key, &value)) {
983
/* Prevent repr from deleting value during key format. */
985
s = PyObject_Repr(key);
986
PyUnicode_Append(&s, colon);
987
PyUnicode_AppendAndDel(&s, PyObject_Repr(value));
991
status = PyList_Append(pieces, s);
992
Py_DECREF(s); /* append created a new ref */
997
/* Add "{}" decorations to the first and last items. */
998
assert(PyList_GET_SIZE(pieces) > 0);
999
s = PyUnicode_FromString("{");
1002
temp = PyList_GET_ITEM(pieces, 0);
1003
PyUnicode_AppendAndDel(&s, temp);
1004
PyList_SET_ITEM(pieces, 0, s);
1008
s = PyUnicode_FromString("}");
1011
temp = PyList_GET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1);
1012
PyUnicode_AppendAndDel(&temp, s);
1013
PyList_SET_ITEM(pieces, PyList_GET_SIZE(pieces) - 1, temp);
1017
/* Paste them all together with ", " between. */
1018
s = PyUnicode_FromString(", ");
1021
result = PyUnicode_Join(s, pieces);
1027
Py_ReprLeave((PyObject *)mp);
1032
dict_length(PyDictObject *mp)
1038
dict_subscript(PyDictObject *mp, register PyObject *key)
1043
assert(mp->ma_table != NULL);
1044
if (!PyUnicode_CheckExact(key) ||
1045
(hash = ((PyUnicodeObject *) key)->hash) == -1) {
1046
hash = PyObject_Hash(key);
1050
ep = (mp->ma_lookup)(mp, key, hash);
1055
if (!PyDict_CheckExact(mp)) {
1056
/* Look up __missing__ method if we're a subclass. */
1058
static PyObject *missing_str = NULL;
1059
if (missing_str == NULL)
1061
PyUnicode_InternFromString("__missing__");
1062
missing = _PyType_Lookup(Py_TYPE(mp), missing_str);
1063
if (missing != NULL)
1064
return PyObject_CallFunctionObjArgs(missing,
1065
(PyObject *)mp, key, NULL);
1076
dict_ass_sub(PyDictObject *mp, PyObject *v, PyObject *w)
1079
return PyDict_DelItem((PyObject *)mp, v);
1081
return PyDict_SetItem((PyObject *)mp, v, w);
1084
static PyMappingMethods dict_as_mapping = {
1085
(lenfunc)dict_length, /*mp_length*/
1086
(binaryfunc)dict_subscript, /*mp_subscript*/
1087
(objobjargproc)dict_ass_sub, /*mp_ass_subscript*/
1091
dict_keys(register PyDictObject *mp)
1093
register PyObject *v;
1094
register Py_ssize_t i, j;
1103
if (n != mp->ma_used) {
1104
/* Durnit. The allocations caused the dict to resize.
1105
* Just start over, this shouldn't normally happen.
1112
for (i = 0, j = 0; i <= mask; i++) {
1113
if (ep[i].me_value != NULL) {
1114
PyObject *key = ep[i].me_key;
1116
PyList_SET_ITEM(v, j, key);
1125
dict_values(register PyDictObject *mp)
1127
register PyObject *v;
1128
register Py_ssize_t i, j;
1137
if (n != mp->ma_used) {
1138
/* Durnit. The allocations caused the dict to resize.
1139
* Just start over, this shouldn't normally happen.
1146
for (i = 0, j = 0; i <= mask; i++) {
1147
if (ep[i].me_value != NULL) {
1148
PyObject *value = ep[i].me_value;
1150
PyList_SET_ITEM(v, j, value);
1159
dict_items(register PyDictObject *mp)
1161
register PyObject *v;
1162
register Py_ssize_t i, j, n;
1164
PyObject *item, *key, *value;
1167
/* Preallocate the list of tuples, to avoid allocations during
1168
* the loop over the items, which could trigger GC, which
1169
* could resize the dict. :-(
1176
for (i = 0; i < n; i++) {
1177
item = PyTuple_New(2);
1182
PyList_SET_ITEM(v, i, item);
1184
if (n != mp->ma_used) {
1185
/* Durnit. The allocations caused the dict to resize.
1186
* Just start over, this shouldn't normally happen.
1191
/* Nothing we do below makes any function calls. */
1194
for (i = 0, j = 0; i <= mask; i++) {
1195
if ((value=ep[i].me_value) != NULL) {
1197
item = PyList_GET_ITEM(v, j);
1199
PyTuple_SET_ITEM(item, 0, key);
1201
PyTuple_SET_ITEM(item, 1, value);
1210
dict_fromkeys(PyObject *cls, PyObject *args)
1213
PyObject *value = Py_None;
1214
PyObject *it; /* iter(seq) */
1219
if (!PyArg_UnpackTuple(args, "fromkeys", 1, 2, &seq, &value))
1222
d = PyObject_CallObject(cls, NULL);
1226
if (PyDict_CheckExact(d) && PyDict_CheckExact(seq)) {
1227
PyDictObject *mp = (PyDictObject *)d;
1233
if (dictresize(mp, Py_SIZE(seq)))
1236
while (_PyDict_Next(seq, &pos, &key, &oldvalue, &hash)) {
1239
if (insertdict(mp, key, hash, value))
1245
if (PyDict_CheckExact(d) && PyAnySet_CheckExact(seq)) {
1246
PyDictObject *mp = (PyDictObject *)d;
1251
if (dictresize(mp, PySet_GET_SIZE(seq)))
1254
while (_PySet_NextEntry(seq, &pos, &key, &hash)) {
1257
if (insertdict(mp, key, hash, value))
1263
it = PyObject_GetIter(seq);
1269
if (PyDict_CheckExact(d)) {
1270
while ((key = PyIter_Next(it)) != NULL) {
1271
status = PyDict_SetItem(d, key, value);
1277
while ((key = PyIter_Next(it)) != NULL) {
1278
status = PyObject_SetItem(d, key, value);
1285
if (PyErr_Occurred())
1297
dict_update_common(PyObject *self, PyObject *args, PyObject *kwds, char *methname)
1299
PyObject *arg = NULL;
1302
if (!PyArg_UnpackTuple(args, methname, 0, 1, &arg))
1305
else if (arg != NULL) {
1306
if (PyObject_HasAttrString(arg, "keys"))
1307
result = PyDict_Merge(self, arg, 1);
1309
result = PyDict_MergeFromSeq2(self, arg, 1);
1311
if (result == 0 && kwds != NULL)
1312
result = PyDict_Merge(self, kwds, 1);
1317
dict_update(PyObject *self, PyObject *args, PyObject *kwds)
1319
if (dict_update_common(self, args, kwds, "update") != -1)
1324
/* Update unconditionally replaces existing items.
1325
Merge has a 3rd argument 'override'; if set, it acts like Update,
1326
otherwise it leaves existing items unchanged.
1328
PyDict_{Update,Merge} update/merge from a mapping object.
1330
PyDict_MergeFromSeq2 updates/merges from any iterable object
1331
producing iterable objects of length 2.
1335
PyDict_MergeFromSeq2(PyObject *d, PyObject *seq2, int override)
1337
PyObject *it; /* iter(seq2) */
1338
Py_ssize_t i; /* index into seq2 of current element */
1339
PyObject *item; /* seq2[i] */
1340
PyObject *fast; /* item as a 2-tuple or 2-list */
1343
assert(PyDict_Check(d));
1344
assert(seq2 != NULL);
1346
it = PyObject_GetIter(seq2);
1350
for (i = 0; ; ++i) {
1351
PyObject *key, *value;
1355
item = PyIter_Next(it);
1357
if (PyErr_Occurred())
1362
/* Convert item to sequence, and verify length 2. */
1363
fast = PySequence_Fast(item, "");
1365
if (PyErr_ExceptionMatches(PyExc_TypeError))
1366
PyErr_Format(PyExc_TypeError,
1367
"cannot convert dictionary update "
1368
"sequence element #%zd to a sequence",
1372
n = PySequence_Fast_GET_SIZE(fast);
1374
PyErr_Format(PyExc_ValueError,
1375
"dictionary update sequence element #%zd "
1376
"has length %zd; 2 is required",
1381
/* Update/merge with this (key, value) pair. */
1382
key = PySequence_Fast_GET_ITEM(fast, 0);
1383
value = PySequence_Fast_GET_ITEM(fast, 1);
1384
if (override || PyDict_GetItem(d, key) == NULL) {
1385
int status = PyDict_SetItem(d, key, value);
1401
return Py_SAFE_DOWNCAST(i, Py_ssize_t, int);
1405
PyDict_Update(PyObject *a, PyObject *b)
1407
return PyDict_Merge(a, b, 1);
1411
PyDict_Merge(PyObject *a, PyObject *b, int override)
1413
register PyDictObject *mp, *other;
1414
register Py_ssize_t i;
1417
/* We accept for the argument either a concrete dictionary object,
1418
* or an abstract "mapping" object. For the former, we can do
1419
* things quite efficiently. For the latter, we only require that
1420
* PyMapping_Keys() and PyObject_GetItem() be supported.
1422
if (a == NULL || !PyDict_Check(a) || b == NULL) {
1423
PyErr_BadInternalCall();
1426
mp = (PyDictObject*)a;
1427
if (PyDict_Check(b)) {
1428
other = (PyDictObject*)b;
1429
if (other == mp || other->ma_used == 0)
1430
/* a.update(a) or a.update({}); nothing to do */
1432
if (mp->ma_used == 0)
1433
/* Since the target dict is empty, PyDict_GetItem()
1434
* always returns NULL. Setting override to 1
1435
* skips the unnecessary test.
1438
/* Do one big resize at the start, rather than
1439
* incrementally resizing as we insert new items. Expect
1440
* that there will be no (or few) overlapping keys.
1442
if ((mp->ma_fill + other->ma_used)*3 >= (mp->ma_mask+1)*2) {
1443
if (dictresize(mp, (mp->ma_used + other->ma_used)*2) != 0)
1446
for (i = 0; i <= other->ma_mask; i++) {
1447
entry = &other->ma_table[i];
1448
if (entry->me_value != NULL &&
1450
PyDict_GetItem(a, entry->me_key) == NULL)) {
1451
Py_INCREF(entry->me_key);
1452
Py_INCREF(entry->me_value);
1453
if (insertdict(mp, entry->me_key,
1454
(long)entry->me_hash,
1455
entry->me_value) != 0)
1461
/* Do it the generic, slower way */
1462
PyObject *keys = PyMapping_Keys(b);
1464
PyObject *key, *value;
1468
/* Docstring says this is equivalent to E.keys() so
1469
* if E doesn't have a .keys() method we want
1470
* AttributeError to percolate up. Might as well
1471
* do the same for any other error.
1475
iter = PyObject_GetIter(keys);
1480
for (key = PyIter_Next(iter); key; key = PyIter_Next(iter)) {
1481
if (!override && PyDict_GetItem(a, key) != NULL) {
1485
value = PyObject_GetItem(b, key);
1486
if (value == NULL) {
1491
status = PyDict_SetItem(a, key, value);
1500
if (PyErr_Occurred())
1501
/* Iterator completed, via error */
1508
dict_copy(register PyDictObject *mp)
1510
return PyDict_Copy((PyObject*)mp);
1514
PyDict_Copy(PyObject *o)
1518
if (o == NULL || !PyDict_Check(o)) {
1519
PyErr_BadInternalCall();
1522
copy = PyDict_New();
1525
if (PyDict_Merge(copy, o, 1) == 0)
1532
PyDict_Size(PyObject *mp)
1534
if (mp == NULL || !PyDict_Check(mp)) {
1535
PyErr_BadInternalCall();
1538
return ((PyDictObject *)mp)->ma_used;
1542
PyDict_Keys(PyObject *mp)
1544
if (mp == NULL || !PyDict_Check(mp)) {
1545
PyErr_BadInternalCall();
1548
return dict_keys((PyDictObject *)mp);
1552
PyDict_Values(PyObject *mp)
1554
if (mp == NULL || !PyDict_Check(mp)) {
1555
PyErr_BadInternalCall();
1558
return dict_values((PyDictObject *)mp);
1562
PyDict_Items(PyObject *mp)
1564
if (mp == NULL || !PyDict_Check(mp)) {
1565
PyErr_BadInternalCall();
1568
return dict_items((PyDictObject *)mp);
1571
/* Return 1 if dicts equal, 0 if not, -1 if error.
1572
* Gets out as soon as any difference is detected.
1573
* Uses only Py_EQ comparison.
1576
dict_equal(PyDictObject *a, PyDictObject *b)
1580
if (a->ma_used != b->ma_used)
1581
/* can't be equal if # of entries differ */
1584
/* Same # of entries -- check all of 'em. Exit early on any diff. */
1585
for (i = 0; i <= a->ma_mask; i++) {
1586
PyObject *aval = a->ma_table[i].me_value;
1590
PyObject *key = a->ma_table[i].me_key;
1591
/* temporarily bump aval's refcount to ensure it stays
1592
alive until we're done with it */
1596
bval = PyDict_GetItemWithError((PyObject *)b, key);
1600
if (PyErr_Occurred())
1604
cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);
1606
if (cmp <= 0) /* error or not equal */
1614
dict_richcompare(PyObject *v, PyObject *w, int op)
1619
if (!PyDict_Check(v) || !PyDict_Check(w)) {
1620
res = Py_NotImplemented;
1622
else if (op == Py_EQ || op == Py_NE) {
1623
cmp = dict_equal((PyDictObject *)v, (PyDictObject *)w);
1626
res = (cmp == (op == Py_EQ)) ? Py_True : Py_False;
1629
res = Py_NotImplemented;
1635
dict_contains(register PyDictObject *mp, PyObject *key)
1640
if (!PyUnicode_CheckExact(key) ||
1641
(hash = ((PyUnicodeObject *) key)->hash) == -1) {
1642
hash = PyObject_Hash(key);
1646
ep = (mp->ma_lookup)(mp, key, hash);
1649
return PyBool_FromLong(ep->me_value != NULL);
1653
dict_get(register PyDictObject *mp, PyObject *args)
1656
PyObject *failobj = Py_None;
1657
PyObject *val = NULL;
1661
if (!PyArg_UnpackTuple(args, "get", 1, 2, &key, &failobj))
1664
if (!PyUnicode_CheckExact(key) ||
1665
(hash = ((PyUnicodeObject *) key)->hash) == -1) {
1666
hash = PyObject_Hash(key);
1670
ep = (mp->ma_lookup)(mp, key, hash);
1682
dict_setdefault(register PyDictObject *mp, PyObject *args)
1685
PyObject *failobj = Py_None;
1686
PyObject *val = NULL;
1690
if (!PyArg_UnpackTuple(args, "setdefault", 1, 2, &key, &failobj))
1693
if (!PyUnicode_CheckExact(key) ||
1694
(hash = ((PyUnicodeObject *) key)->hash) == -1) {
1695
hash = PyObject_Hash(key);
1699
ep = (mp->ma_lookup)(mp, key, hash);
1705
if (PyDict_SetItem((PyObject*)mp, key, failobj))
1714
dict_clear(register PyDictObject *mp)
1716
PyDict_Clear((PyObject *)mp);
1721
dict_pop(PyDictObject *mp, PyObject *args)
1725
PyObject *old_value, *old_key;
1726
PyObject *key, *deflt = NULL;
1728
if(!PyArg_UnpackTuple(args, "pop", 1, 2, &key, &deflt))
1730
if (mp->ma_used == 0) {
1735
PyErr_SetString(PyExc_KeyError,
1736
"pop(): dictionary is empty");
1739
if (!PyUnicode_CheckExact(key) ||
1740
(hash = ((PyUnicodeObject *) key)->hash) == -1) {
1741
hash = PyObject_Hash(key);
1745
ep = (mp->ma_lookup)(mp, key, hash);
1748
if (ep->me_value == NULL) {
1756
old_key = ep->me_key;
1759
old_value = ep->me_value;
1760
ep->me_value = NULL;
1767
dict_popitem(PyDictObject *mp)
1773
/* Allocate the result tuple before checking the size. Believe it
1774
* or not, this allocation could trigger a garbage collection which
1775
* could empty the dict, so if we checked the size first and that
1776
* happened, the result would be an infinite loop (searching for an
1777
* entry that no longer exists). Note that the usual popitem()
1778
* idiom is "while d: k, v = d.popitem()". so needing to throw the
1779
* tuple away if the dict *is* empty isn't a significant
1780
* inefficiency -- possible, but unlikely in practice.
1782
res = PyTuple_New(2);
1785
if (mp->ma_used == 0) {
1787
PyErr_SetString(PyExc_KeyError,
1788
"popitem(): dictionary is empty");
1791
/* Set ep to "the first" dict entry with a value. We abuse the hash
1792
* field of slot 0 to hold a search finger:
1793
* If slot 0 has a value, use slot 0.
1794
* Else slot 0 is being used to hold a search finger,
1795
* and we use its hash value as the first index to look.
1797
ep = &mp->ma_table[0];
1798
if (ep->me_value == NULL) {
1800
/* The hash field may be a real hash value, or it may be a
1801
* legit search finger, or it may be a once-legit search
1802
* finger that's out of bounds now because it wrapped around
1803
* or the table shrunk -- simply make sure it's in bounds now.
1805
if (i > mp->ma_mask || i < 1)
1806
i = 1; /* skip slot 0 */
1807
while ((ep = &mp->ma_table[i])->me_value == NULL) {
1809
if (i > mp->ma_mask)
1813
PyTuple_SET_ITEM(res, 0, ep->me_key);
1814
PyTuple_SET_ITEM(res, 1, ep->me_value);
1817
ep->me_value = NULL;
1819
assert(mp->ma_table[0].me_value == NULL);
1820
mp->ma_table[0].me_hash = i + 1; /* next place to start */
1825
dict_traverse(PyObject *op, visitproc visit, void *arg)
1831
while (PyDict_Next(op, &i, &pk, &pv)) {
1839
dict_tp_clear(PyObject *op)
1845
static PyObject *dictiter_new(PyDictObject *, PyTypeObject *);
1848
dict_sizeof(PyDictObject *mp)
1852
res = sizeof(PyDictObject);
1853
if (mp->ma_table != mp->ma_smalltable)
1854
res = res + (mp->ma_mask + 1) * sizeof(PyDictEntry);
1855
return PyLong_FromSsize_t(res);
1858
PyDoc_STRVAR(contains__doc__,
1859
"D.__contains__(k) -> True if D has a key k, else False");
1861
PyDoc_STRVAR(getitem__doc__, "x.__getitem__(y) <==> x[y]");
1863
PyDoc_STRVAR(sizeof__doc__,
1864
"D.__sizeof__() -> size of D in memory, in bytes");
1866
PyDoc_STRVAR(get__doc__,
1867
"D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.");
1869
PyDoc_STRVAR(setdefault_doc__,
1870
"D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D");
1872
PyDoc_STRVAR(pop__doc__,
1873
"D.pop(k[,d]) -> v, remove specified key and return the corresponding value.\n\
1874
If key is not found, d is returned if given, otherwise KeyError is raised");
1876
PyDoc_STRVAR(popitem__doc__,
1877
"D.popitem() -> (k, v), remove and return some (key, value) pair as a\n\
1878
2-tuple; but raise KeyError if D is empty.");
1880
PyDoc_STRVAR(update__doc__,
1881
"D.update(E, **F) -> None. Update D from dict/iterable E and F.\n"
1882
"If E has a .keys() method, does: for k in E: D[k] = E[k]\n\
1883
If E lacks .keys() method, does: for (k, v) in E: D[k] = v\n\
1884
In either case, this is followed by: for k in F: D[k] = F[k]");
1886
PyDoc_STRVAR(fromkeys__doc__,
1887
"dict.fromkeys(S[,v]) -> New dict with keys from S and values equal to v.\n\
1888
v defaults to None.");
1890
PyDoc_STRVAR(clear__doc__,
1891
"D.clear() -> None. Remove all items from D.");
1893
PyDoc_STRVAR(copy__doc__,
1894
"D.copy() -> a shallow copy of D");
1897
static PyObject *dictkeys_new(PyObject *);
1898
static PyObject *dictitems_new(PyObject *);
1899
static PyObject *dictvalues_new(PyObject *);
1901
PyDoc_STRVAR(keys__doc__,
1902
"D.keys() -> a set-like object providing a view on D's keys");
1903
PyDoc_STRVAR(items__doc__,
1904
"D.items() -> a set-like object providing a view on D's items");
1905
PyDoc_STRVAR(values__doc__,
1906
"D.values() -> an object providing a view on D's values");
1908
static PyMethodDef mapp_methods[] = {
1909
{"__contains__",(PyCFunction)dict_contains, METH_O | METH_COEXIST,
1911
{"__getitem__", (PyCFunction)dict_subscript, METH_O | METH_COEXIST,
1913
{"__sizeof__", (PyCFunction)dict_sizeof, METH_NOARGS,
1915
{"get", (PyCFunction)dict_get, METH_VARARGS,
1917
{"setdefault", (PyCFunction)dict_setdefault, METH_VARARGS,
1919
{"pop", (PyCFunction)dict_pop, METH_VARARGS,
1921
{"popitem", (PyCFunction)dict_popitem, METH_NOARGS,
1923
{"keys", (PyCFunction)dictkeys_new, METH_NOARGS,
1925
{"items", (PyCFunction)dictitems_new, METH_NOARGS,
1927
{"values", (PyCFunction)dictvalues_new, METH_NOARGS,
1929
{"update", (PyCFunction)dict_update, METH_VARARGS | METH_KEYWORDS,
1931
{"fromkeys", (PyCFunction)dict_fromkeys, METH_VARARGS | METH_CLASS,
1933
{"clear", (PyCFunction)dict_clear, METH_NOARGS,
1935
{"copy", (PyCFunction)dict_copy, METH_NOARGS,
1937
{NULL, NULL} /* sentinel */
1940
/* Return 1 if `key` is in dict `op`, 0 if not, and -1 on error. */
1942
PyDict_Contains(PyObject *op, PyObject *key)
1945
PyDictObject *mp = (PyDictObject *)op;
1948
if (!PyUnicode_CheckExact(key) ||
1949
(hash = ((PyUnicodeObject *) key)->hash) == -1) {
1950
hash = PyObject_Hash(key);
1954
ep = (mp->ma_lookup)(mp, key, hash);
1955
return ep == NULL ? -1 : (ep->me_value != NULL);
1958
/* Internal version of PyDict_Contains used when the hash value is already known */
1960
_PyDict_Contains(PyObject *op, PyObject *key, long hash)
1962
PyDictObject *mp = (PyDictObject *)op;
1965
ep = (mp->ma_lookup)(mp, key, hash);
1966
return ep == NULL ? -1 : (ep->me_value != NULL);
1969
/* Hack to implement "key in dict" */
1970
static PySequenceMethods dict_as_sequence = {
1976
0, /* sq_ass_item */
1977
0, /* sq_ass_slice */
1978
PyDict_Contains, /* sq_contains */
1979
0, /* sq_inplace_concat */
1980
0, /* sq_inplace_repeat */
1984
dict_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
1988
assert(type != NULL && type->tp_alloc != NULL);
1989
self = type->tp_alloc(type, 0);
1991
PyDictObject *d = (PyDictObject *)self;
1992
/* It's guaranteed that tp->alloc zeroed out the struct. */
1993
assert(d->ma_table == NULL && d->ma_fill == 0 && d->ma_used == 0);
1994
INIT_NONZERO_DICT_SLOTS(d);
1995
d->ma_lookup = lookdict_unicode;
1996
#ifdef SHOW_CONVERSION_COUNTS
2004
dict_init(PyObject *self, PyObject *args, PyObject *kwds)
2006
return dict_update_common(self, args, kwds, "dict");
2010
dict_iter(PyDictObject *dict)
2012
return dictiter_new(dict, &PyDictIterKey_Type);
2015
PyDoc_STRVAR(dictionary_doc,
2016
"dict() -> new empty dictionary.\n"
2017
"dict(mapping) -> new dictionary initialized from a mapping object's\n"
2018
" (key, value) pairs.\n"
2019
"dict(seq) -> new dictionary initialized as if via:\n"
2021
" for k, v in seq:\n"
2023
"dict(**kwargs) -> new dictionary initialized with the name=value pairs\n"
2024
" in the keyword argument list. For example: dict(one=1, two=2)");
2026
PyTypeObject PyDict_Type = {
2027
PyVarObject_HEAD_INIT(&PyType_Type, 0)
2029
sizeof(PyDictObject),
2031
(destructor)dict_dealloc, /* tp_dealloc */
2035
0, /* tp_reserved */
2036
(reprfunc)dict_repr, /* tp_repr */
2037
0, /* tp_as_number */
2038
&dict_as_sequence, /* tp_as_sequence */
2039
&dict_as_mapping, /* tp_as_mapping */
2040
(hashfunc)PyObject_HashNotImplemented, /* tp_hash */
2043
PyObject_GenericGetAttr, /* tp_getattro */
2044
0, /* tp_setattro */
2045
0, /* tp_as_buffer */
2046
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC |
2047
Py_TPFLAGS_BASETYPE | Py_TPFLAGS_DICT_SUBCLASS, /* tp_flags */
2048
dictionary_doc, /* tp_doc */
2049
dict_traverse, /* tp_traverse */
2050
dict_tp_clear, /* tp_clear */
2051
dict_richcompare, /* tp_richcompare */
2052
0, /* tp_weaklistoffset */
2053
(getiterfunc)dict_iter, /* tp_iter */
2054
0, /* tp_iternext */
2055
mapp_methods, /* tp_methods */
2060
0, /* tp_descr_get */
2061
0, /* tp_descr_set */
2062
0, /* tp_dictoffset */
2063
dict_init, /* tp_init */
2064
PyType_GenericAlloc, /* tp_alloc */
2065
dict_new, /* tp_new */
2066
PyObject_GC_Del, /* tp_free */
2069
/* For backward compatibility with old dictionary interface */
2072
PyDict_GetItemString(PyObject *v, const char *key)
2075
kv = PyUnicode_FromString(key);
2078
rv = PyDict_GetItem(v, kv);
2084
PyDict_SetItemString(PyObject *v, const char *key, PyObject *item)
2088
kv = PyUnicode_FromString(key);
2091
PyUnicode_InternInPlace(&kv); /* XXX Should we really? */
2092
err = PyDict_SetItem(v, kv, item);
2098
PyDict_DelItemString(PyObject *v, const char *key)
2102
kv = PyUnicode_FromString(key);
2105
err = PyDict_DelItem(v, kv);
2110
/* Dictionary iterator types */
2114
PyDictObject *di_dict; /* Set to NULL when iterator is exhausted */
2117
PyObject* di_result; /* reusable result tuple for iteritems */
2122
dictiter_new(PyDictObject *dict, PyTypeObject *itertype)
2125
di = PyObject_GC_New(dictiterobject, itertype);
2130
di->di_used = dict->ma_used;
2132
di->len = dict->ma_used;
2133
if (itertype == &PyDictIterItem_Type) {
2134
di->di_result = PyTuple_Pack(2, Py_None, Py_None);
2135
if (di->di_result == NULL) {
2141
di->di_result = NULL;
2142
_PyObject_GC_TRACK(di);
2143
return (PyObject *)di;
2147
dictiter_dealloc(dictiterobject *di)
2149
Py_XDECREF(di->di_dict);
2150
Py_XDECREF(di->di_result);
2151
PyObject_GC_Del(di);
2155
dictiter_traverse(dictiterobject *di, visitproc visit, void *arg)
2157
Py_VISIT(di->di_dict);
2158
Py_VISIT(di->di_result);
2163
dictiter_len(dictiterobject *di)
2166
if (di->di_dict != NULL && di->di_used == di->di_dict->ma_used)
2168
return PyLong_FromSize_t(len);
2171
PyDoc_STRVAR(length_hint_doc,
2172
"Private method returning an estimate of len(list(it)).");
2174
static PyMethodDef dictiter_methods[] = {
2175
{"__length_hint__", (PyCFunction)dictiter_len, METH_NOARGS,
2177
{NULL, NULL} /* sentinel */
2180
static PyObject *dictiter_iternextkey(dictiterobject *di)
2183
register Py_ssize_t i, mask;
2184
register PyDictEntry *ep;
2185
PyDictObject *d = di->di_dict;
2189
assert (PyDict_Check(d));
2191
if (di->di_used != d->ma_used) {
2192
PyErr_SetString(PyExc_RuntimeError,
2193
"dictionary changed size during iteration");
2194
di->di_used = -1; /* Make this state sticky */
2203
while (i <= mask && ep[i].me_value == NULL)
2219
PyTypeObject PyDictIterKey_Type = {
2220
PyVarObject_HEAD_INIT(&PyType_Type, 0)
2221
"dict_keyiterator", /* tp_name */
2222
sizeof(dictiterobject), /* tp_basicsize */
2223
0, /* tp_itemsize */
2225
(destructor)dictiter_dealloc, /* tp_dealloc */
2229
0, /* tp_reserved */
2231
0, /* tp_as_number */
2232
0, /* tp_as_sequence */
2233
0, /* tp_as_mapping */
2237
PyObject_GenericGetAttr, /* tp_getattro */
2238
0, /* tp_setattro */
2239
0, /* tp_as_buffer */
2240
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2242
(traverseproc)dictiter_traverse, /* tp_traverse */
2244
0, /* tp_richcompare */
2245
0, /* tp_weaklistoffset */
2246
PyObject_SelfIter, /* tp_iter */
2247
(iternextfunc)dictiter_iternextkey, /* tp_iternext */
2248
dictiter_methods, /* tp_methods */
2252
static PyObject *dictiter_iternextvalue(dictiterobject *di)
2255
register Py_ssize_t i, mask;
2256
register PyDictEntry *ep;
2257
PyDictObject *d = di->di_dict;
2261
assert (PyDict_Check(d));
2263
if (di->di_used != d->ma_used) {
2264
PyErr_SetString(PyExc_RuntimeError,
2265
"dictionary changed size during iteration");
2266
di->di_used = -1; /* Make this state sticky */
2272
if (i < 0 || i > mask)
2275
while ((value=ep[i].me_value) == NULL) {
2291
PyTypeObject PyDictIterValue_Type = {
2292
PyVarObject_HEAD_INIT(&PyType_Type, 0)
2293
"dict_valueiterator", /* tp_name */
2294
sizeof(dictiterobject), /* tp_basicsize */
2295
0, /* tp_itemsize */
2297
(destructor)dictiter_dealloc, /* tp_dealloc */
2301
0, /* tp_reserved */
2303
0, /* tp_as_number */
2304
0, /* tp_as_sequence */
2305
0, /* tp_as_mapping */
2309
PyObject_GenericGetAttr, /* tp_getattro */
2310
0, /* tp_setattro */
2311
0, /* tp_as_buffer */
2312
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2314
(traverseproc)dictiter_traverse, /* tp_traverse */
2316
0, /* tp_richcompare */
2317
0, /* tp_weaklistoffset */
2318
PyObject_SelfIter, /* tp_iter */
2319
(iternextfunc)dictiter_iternextvalue, /* tp_iternext */
2320
dictiter_methods, /* tp_methods */
2324
static PyObject *dictiter_iternextitem(dictiterobject *di)
2326
PyObject *key, *value, *result = di->di_result;
2327
register Py_ssize_t i, mask;
2328
register PyDictEntry *ep;
2329
PyDictObject *d = di->di_dict;
2333
assert (PyDict_Check(d));
2335
if (di->di_used != d->ma_used) {
2336
PyErr_SetString(PyExc_RuntimeError,
2337
"dictionary changed size during iteration");
2338
di->di_used = -1; /* Make this state sticky */
2347
while (i <= mask && ep[i].me_value == NULL)
2353
if (result->ob_refcnt == 1) {
2355
Py_DECREF(PyTuple_GET_ITEM(result, 0));
2356
Py_DECREF(PyTuple_GET_ITEM(result, 1));
2358
result = PyTuple_New(2);
2364
value = ep[i].me_value;
2367
PyTuple_SET_ITEM(result, 0, key);
2368
PyTuple_SET_ITEM(result, 1, value);
2377
PyTypeObject PyDictIterItem_Type = {
2378
PyVarObject_HEAD_INIT(&PyType_Type, 0)
2379
"dict_itemiterator", /* tp_name */
2380
sizeof(dictiterobject), /* tp_basicsize */
2381
0, /* tp_itemsize */
2383
(destructor)dictiter_dealloc, /* tp_dealloc */
2387
0, /* tp_reserved */
2389
0, /* tp_as_number */
2390
0, /* tp_as_sequence */
2391
0, /* tp_as_mapping */
2395
PyObject_GenericGetAttr, /* tp_getattro */
2396
0, /* tp_setattro */
2397
0, /* tp_as_buffer */
2398
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2400
(traverseproc)dictiter_traverse, /* tp_traverse */
2402
0, /* tp_richcompare */
2403
0, /* tp_weaklistoffset */
2404
PyObject_SelfIter, /* tp_iter */
2405
(iternextfunc)dictiter_iternextitem, /* tp_iternext */
2406
dictiter_methods, /* tp_methods */
2411
/***********************************************/
2412
/* View objects for keys(), items(), values(). */
2413
/***********************************************/
2415
/* The instance lay-out is the same for all three; but the type differs. */
2419
PyDictObject *dv_dict;
2424
dictview_dealloc(dictviewobject *dv)
2426
Py_XDECREF(dv->dv_dict);
2427
PyObject_GC_Del(dv);
2431
dictview_traverse(dictviewobject *dv, visitproc visit, void *arg)
2433
Py_VISIT(dv->dv_dict);
2438
dictview_len(dictviewobject *dv)
2441
if (dv->dv_dict != NULL)
2442
len = dv->dv_dict->ma_used;
2447
dictview_new(PyObject *dict, PyTypeObject *type)
2451
PyErr_BadInternalCall();
2454
if (!PyDict_Check(dict)) {
2455
/* XXX Get rid of this restriction later */
2456
PyErr_Format(PyExc_TypeError,
2457
"%s() requires a dict argument, not '%s'",
2458
type->tp_name, dict->ob_type->tp_name);
2461
dv = PyObject_GC_New(dictviewobject, type);
2465
dv->dv_dict = (PyDictObject *)dict;
2466
_PyObject_GC_TRACK(dv);
2467
return (PyObject *)dv;
2470
/* TODO(guido): The views objects are not complete:
2472
* support more set operations
2473
* support arbitrary mappings?
2474
- either these should be static or exported in dictobject.h
2475
- if public then they should probably be in builtins
2478
/* Return 1 if self is a subset of other, iterating over self;
2479
0 if not; -1 if an error occurred. */
2481
all_contained_in(PyObject *self, PyObject *other)
2483
PyObject *iter = PyObject_GetIter(self);
2489
PyObject *next = PyIter_Next(iter);
2491
if (PyErr_Occurred())
2495
ok = PySequence_Contains(other, next);
2505
dictview_richcompare(PyObject *self, PyObject *other, int op)
2507
Py_ssize_t len_self, len_other;
2511
assert(self != NULL);
2512
assert(PyDictViewSet_Check(self));
2513
assert(other != NULL);
2515
if (!PyAnySet_Check(other) && !PyDictViewSet_Check(other)) {
2516
Py_INCREF(Py_NotImplemented);
2517
return Py_NotImplemented;
2520
len_self = PyObject_Size(self);
2523
len_other = PyObject_Size(other);
2532
if (len_self == len_other)
2533
ok = all_contained_in(self, other);
2534
if (op == Py_NE && ok >= 0)
2539
if (len_self < len_other)
2540
ok = all_contained_in(self, other);
2544
if (len_self <= len_other)
2545
ok = all_contained_in(self, other);
2549
if (len_self > len_other)
2550
ok = all_contained_in(other, self);
2554
if (len_self >= len_other)
2555
ok = all_contained_in(other, self);
2561
result = ok ? Py_True : Py_False;
2567
dictview_repr(dictviewobject *dv)
2572
seq = PySequence_List((PyObject *)dv);
2576
result = PyUnicode_FromFormat("%s(%R)", Py_TYPE(dv)->tp_name, seq);
2584
dictkeys_iter(dictviewobject *dv)
2586
if (dv->dv_dict == NULL) {
2589
return dictiter_new(dv->dv_dict, &PyDictIterKey_Type);
2593
dictkeys_contains(dictviewobject *dv, PyObject *obj)
2595
if (dv->dv_dict == NULL)
2597
return PyDict_Contains((PyObject *)dv->dv_dict, obj);
2600
static PySequenceMethods dictkeys_as_sequence = {
2601
(lenfunc)dictview_len, /* sq_length */
2606
0, /* sq_ass_item */
2607
0, /* sq_ass_slice */
2608
(objobjproc)dictkeys_contains, /* sq_contains */
2612
dictviews_sub(PyObject* self, PyObject *other)
2614
PyObject *result = PySet_New(self);
2619
tmp = PyObject_CallMethod(result, "difference_update", "O", other);
2630
dictviews_and(PyObject* self, PyObject *other)
2632
PyObject *result = PySet_New(self);
2637
tmp = PyObject_CallMethod(result, "intersection_update", "O", other);
2648
dictviews_or(PyObject* self, PyObject *other)
2650
PyObject *result = PySet_New(self);
2655
tmp = PyObject_CallMethod(result, "update", "O", other);
2666
dictviews_xor(PyObject* self, PyObject *other)
2668
PyObject *result = PySet_New(self);
2673
tmp = PyObject_CallMethod(result, "symmetric_difference_update", "O",
2684
static PyNumberMethods dictviews_as_number = {
2686
(binaryfunc)dictviews_sub, /*nb_subtract*/
2698
(binaryfunc)dictviews_and, /*nb_and*/
2699
(binaryfunc)dictviews_xor, /*nb_xor*/
2700
(binaryfunc)dictviews_or, /*nb_or*/
2703
static PyMethodDef dictkeys_methods[] = {
2704
{NULL, NULL} /* sentinel */
2707
PyTypeObject PyDictKeys_Type = {
2708
PyVarObject_HEAD_INIT(&PyType_Type, 0)
2709
"dict_keys", /* tp_name */
2710
sizeof(dictviewobject), /* tp_basicsize */
2711
0, /* tp_itemsize */
2713
(destructor)dictview_dealloc, /* tp_dealloc */
2717
0, /* tp_reserved */
2718
(reprfunc)dictview_repr, /* tp_repr */
2719
&dictviews_as_number, /* tp_as_number */
2720
&dictkeys_as_sequence, /* tp_as_sequence */
2721
0, /* tp_as_mapping */
2725
PyObject_GenericGetAttr, /* tp_getattro */
2726
0, /* tp_setattro */
2727
0, /* tp_as_buffer */
2728
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2730
(traverseproc)dictview_traverse, /* tp_traverse */
2732
dictview_richcompare, /* tp_richcompare */
2733
0, /* tp_weaklistoffset */
2734
(getiterfunc)dictkeys_iter, /* tp_iter */
2735
0, /* tp_iternext */
2736
dictkeys_methods, /* tp_methods */
2741
dictkeys_new(PyObject *dict)
2743
return dictview_new(dict, &PyDictKeys_Type);
2746
/*** dict_items ***/
2749
dictitems_iter(dictviewobject *dv)
2751
if (dv->dv_dict == NULL) {
2754
return dictiter_new(dv->dv_dict, &PyDictIterItem_Type);
2758
dictitems_contains(dictviewobject *dv, PyObject *obj)
2760
PyObject *key, *value, *found;
2761
if (dv->dv_dict == NULL)
2763
if (!PyTuple_Check(obj) || PyTuple_GET_SIZE(obj) != 2)
2765
key = PyTuple_GET_ITEM(obj, 0);
2766
value = PyTuple_GET_ITEM(obj, 1);
2767
found = PyDict_GetItem((PyObject *)dv->dv_dict, key);
2768
if (found == NULL) {
2769
if (PyErr_Occurred())
2773
return PyObject_RichCompareBool(value, found, Py_EQ);
2776
static PySequenceMethods dictitems_as_sequence = {
2777
(lenfunc)dictview_len, /* sq_length */
2782
0, /* sq_ass_item */
2783
0, /* sq_ass_slice */
2784
(objobjproc)dictitems_contains, /* sq_contains */
2787
static PyMethodDef dictitems_methods[] = {
2788
{NULL, NULL} /* sentinel */
2791
PyTypeObject PyDictItems_Type = {
2792
PyVarObject_HEAD_INIT(&PyType_Type, 0)
2793
"dict_items", /* tp_name */
2794
sizeof(dictviewobject), /* tp_basicsize */
2795
0, /* tp_itemsize */
2797
(destructor)dictview_dealloc, /* tp_dealloc */
2801
0, /* tp_reserved */
2802
(reprfunc)dictview_repr, /* tp_repr */
2803
&dictviews_as_number, /* tp_as_number */
2804
&dictitems_as_sequence, /* tp_as_sequence */
2805
0, /* tp_as_mapping */
2809
PyObject_GenericGetAttr, /* tp_getattro */
2810
0, /* tp_setattro */
2811
0, /* tp_as_buffer */
2812
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2814
(traverseproc)dictview_traverse, /* tp_traverse */
2816
dictview_richcompare, /* tp_richcompare */
2817
0, /* tp_weaklistoffset */
2818
(getiterfunc)dictitems_iter, /* tp_iter */
2819
0, /* tp_iternext */
2820
dictitems_methods, /* tp_methods */
2825
dictitems_new(PyObject *dict)
2827
return dictview_new(dict, &PyDictItems_Type);
2830
/*** dict_values ***/
2833
dictvalues_iter(dictviewobject *dv)
2835
if (dv->dv_dict == NULL) {
2838
return dictiter_new(dv->dv_dict, &PyDictIterValue_Type);
2841
static PySequenceMethods dictvalues_as_sequence = {
2842
(lenfunc)dictview_len, /* sq_length */
2847
0, /* sq_ass_item */
2848
0, /* sq_ass_slice */
2849
(objobjproc)0, /* sq_contains */
2852
static PyMethodDef dictvalues_methods[] = {
2853
{NULL, NULL} /* sentinel */
2856
PyTypeObject PyDictValues_Type = {
2857
PyVarObject_HEAD_INIT(&PyType_Type, 0)
2858
"dict_values", /* tp_name */
2859
sizeof(dictviewobject), /* tp_basicsize */
2860
0, /* tp_itemsize */
2862
(destructor)dictview_dealloc, /* tp_dealloc */
2866
0, /* tp_reserved */
2867
(reprfunc)dictview_repr, /* tp_repr */
2868
0, /* tp_as_number */
2869
&dictvalues_as_sequence, /* tp_as_sequence */
2870
0, /* tp_as_mapping */
2874
PyObject_GenericGetAttr, /* tp_getattro */
2875
0, /* tp_setattro */
2876
0, /* tp_as_buffer */
2877
Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC,/* tp_flags */
2879
(traverseproc)dictview_traverse, /* tp_traverse */
2881
0, /* tp_richcompare */
2882
0, /* tp_weaklistoffset */
2883
(getiterfunc)dictvalues_iter, /* tp_iter */
2884
0, /* tp_iternext */
2885
dictvalues_methods, /* tp_methods */
2890
dictvalues_new(PyObject *dict)
2892
return dictview_new(dict, &PyDictValues_Type);