These mostly parallel the data structures in the gnc engine.
See table-create.sql for more info.

General Theory of Operation
---------------------------
The entire backend is built upon the premise that the local
gnucash process acts only as a 'cache' of (some of) the
'true' data in a remote SQL database. The motivation for
this design is to allow the local process to quickly access
and manipulate data (because it is local), without losing
the benefits of a remote, shared, persistent storage
server (the sql database). If the code seems complex,
that is because it expends a considerable amount of
effort trying to keep the local cache consistent
with the remote store. In the following text (and in the
notes in the source code), "the engine" is used as a
synonym for "the cache", and always refers to the
local data held in the local process.

Session Table, Session Modes
----------------------------
There are four basic modes for accessing the database: "Single User
File Mode", "Single User Update Mode", "Polled Multi-User" and
"Event-Driven Multi-User". The session table in the database
indicates the mode in which the database is functioning.

-- "Single User File Mode" --

Only one user can have access to the database at a time. The
database is used as a glorified file: engine data is 'saved' to
the database only when the user selects 'save' from the GUI
dialog. Just as with a file, this mode erases the contents of
the database and over-writes them with the new data.

This mode exists because it is easy to implement, easy
to debug, and makes reasonable demands on the database for small
datasets. It is not efficient for large datasets, and carries
a certain amount of risk: because it erases the
old data before writing the new data, it is vulnerable to any
problems that occur during the 'save'. For example, if the database
crashes during the write (very unlikely), data could be lost.
If GnuCash has a certain kind of bug (possible), not all
of the data may be saved. If GnuCash crashes before a save,
then any changes since the last save are lost.

This mode is mutually exclusive with all other access modes:
a lockout mechanism prevents other users from accessing the
database while it is open in single-user mode.

-- "Single User Update Mode" --

Only one user can have access to the database at a time. Updates
to data are stored in the DB as they are made in the GUI; there
is no need to perform a mass-save at the end of a session.

When a database is opened in this mode, *all* of the data is
copied from the database into the local engine. This might
result in the gnucash process getting quite large, especially
if the database is large.

This mode is more robust in that there is minimal or no data loss in
the event of a crash. If the database needs to be accessed in
single-user mode, this is the preferred mode to use.

-- "Multi-User Polled" --

Multiple users are assumed; GnuCash polls the database to detect
changes in the data. Data storage is incremental, essentially
the same as in the single-update mode. Data fetch is also
incremental: only the data that is immediately being viewed/edited
is pulled into the local engine. The idea is that the SQL database
may be quite large, but that only some smaller amount of data will
be viewed/edited by a user at any given time. Thus, we save
on network traffic and local memory usage by only pulling in
the data that we need. Note that 'old' data is not purged, so
if a gnucash session runs for a long time in this mode, it may
pull in significant amounts of the dataset, and thus grow quite
large.

Multiple users may simultaneously edit the data in this mode.
However, in this mode, changes made by remote users do not
'automatically' show up as they are made. To see other users'
changes, one has to (for example) close a register window and
reopen it. The local cache of data is synced with the remote
database on an as-needed basis: for example, if the local
user tries to edit the same transaction that some remote user
has modified, then the local copy of the transaction will be updated,
and the user warned that a remote change has occurred.

-- "Multi-User Events" (Default Mode) --

Multiple users are assumed. This mode works essentially the
same way as the multi-user-poll mode, except that (asynchronous)
events are delivered to the local process when remote users
make changes. This causes the local GUI display to automatically
update when remote users make changes. (In the current design,
there is a 10 second delay between updates.)

This automatic update puts slightly more burden on the local
process, on the network, and on the SQL server. Thus, it won't
scale well with lots of users. We don't know where that limit
is; we are guessing it's somewhere between 5 and 20 simultaneous
users, depending on the power of the DB server, of course.

There is a safety lockout that prevents a database that is
currently open in single-user mode from being opened by a
second user at the same time. Similarly, a database that
is currently open in multi-user mode cannot be opened in
single-user mode.

Note, however, that switching modes after all users have
logged off is perfectly safe: if all the multi-user sessions
log off, then the database can be opened in single-user mode
(and vice-versa). Logoff happens 'automatically' when
a user exits gnucash.

If gnucash dies or is killed, a user might be left logged
on. If it dies in multi-user mode, it may seem that many
users are still logged on; this will prevent the database
from subsequently being opened in single-user mode.

To force a log-off, you can issue the following command:

   echo "UPDATE gncsession SET time_off='NOW' WHERE time_off = 'infinity';" | psql dbname

Just be sure that all users really are logged off when you do
this! Querying the gncsession table shows the logon and logoff
times, the login name, the user's 'true name', and the FQDN of
the user's host.

The pgendSyncSingleFile() subroutine performs the equivalent of 'file
save'. Note that it does this by deleting the entire contents of the
database, and then writing out the entire contents of the engine. It
works this way (needs to work this way) in order to make sure that
deleted transactions, etc. are really deleted from the database. This
is necessary because in this mode, the backend never finds out about
deletions. If you want incremental deletion, then use the 'Single
Update' mode.

Connecting to Postgres
----------------------
The Postgres API requires a database to connect to. The initial
connect is made using the "template1" database, which is the default
database that is always created when Postgres is installed. Thus,
we assume it's always present.

Multi-User-Poll Pseudocode
--------------------------
Specifically, this scenario:

1. User A loads a session for the data.
2. User B loads a session for the data.

   At this point, both users have complete
   account trees for the original data.

3. User A adds some new accounts with
   transactions in them.
4. User B runs a transaction query that
   picks up the new transactions created
   by user A.

At step 4, there is currently a bug; the following is what should
happen: B will notice that there are splits in accounts that it
does not know about. B will then go to the database, get these
accounts, and create them in the local engine.

Some of the code is auto-gen'ed from m4 macros. This mostly just
simplifies some rather repetitive, cut-n-paste code that's identical
from function to function. If you can think of a better way, let me
know.

The GUI and the engine allow any characters within a string; however,
in SQL some characters are reserved. These reserved characters are
escaped by the builder.c routine sqlBuilder_escape() before storage.
It converts single-quotes and backslashes to escaped quotes &
backslashes to prevent them from being misinterpreted by the SQL
parser.

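The effect of that escaping can be sketched like this. This is not
the actual sqlBuilder_escape() code; the function name, signature,
and buffer handling here are illustrative only.

```c
#include <string.h>

/* Illustrative sketch only: double up single-quotes and backslashes
 * so the SQL parser treats them as literal characters.  The real
 * routine lives in builder.c as sqlBuilder_escape(), and the caller
 * must supply an output buffer at least twice the input length. */
static void sql_escape_sketch(const char *in, char *out)
{
    while (*in) {
        if (*in == '\'' || *in == '\\')
            *out++ = *in;   /* emit the escape: the character, doubled */
        *out++ = *in++;
    }
    *out = '\0';
}
```

So a string such as O'Brien is stored through the builder as
O''Brien, keeping the embedded quote out of the SQL syntax.
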
Storage of KVP values in the sql database is treated more or less as
described in the main KVP docs. The hierarchical structure is converted
into 'paths' by concatenating key names, using / as the separator.
(Thus, paths look like file-paths.) The root of each frame is
associated with a guid (and thus a url kvp://12341234/some/kvp/keys,
where 12341234 is the guid).

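Constructing such a url is just string concatenation; as a sketch
(the helper name and signature below are made up, not from the code):

```c
#include <stdio.h>

/* Sketch: build the kvp:// url for a guid plus an already-flattened
 * key path such as "/some/kvp/keys".  Illustrative only. */
static void kvp_make_url(const char *guid, const char *path,
                         char *out, size_t outlen)
{
    snprintf(out, outlen, "kvp://%s%s", guid, path);
}
```
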
The implementation caches the paths, associating a 32-bit inode number
with each path. Caching is done because the same path names will recur
frequently for different guids (e.g. /reconcile-info/last-date will
occur in most accounts).

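The cache behaves roughly like this sketch. It is illustrative only:
the real cache lives in the backend and persists its path numbers in
the database, and would use a hash table rather than a linear scan.

```c
#include <string.h>

/* Sketch of the path cache: hand out a small integer 'inode' for
 * each distinct path, returning the same integer when the path
 * recurs (as it will, for different guids). */
#define MAX_PATHS 256
static const char *path_table[MAX_PATHS];
static int n_paths = 0;

static int path_to_inode(const char *path)
{
    for (int i = 0; i < n_paths; i++)
        if (strcmp(path_table[i], path) == 0)
            return i + 1;           /* cache hit: reuse the number */
    if (n_paths >= MAX_PATHS)
        return 0;                   /* cache full (sketch only) */
    path_table[n_paths] = path;     /* caller keeps the string alive */
    return ++n_paths;               /* cache miss: assign a new one */
}
```
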
The implementation also caches guids (associating a unique 32-bit int
with each), although the utility of this is a bit dubious. But hey, it
works, and it saves a little bit of storage.

The actual values are stored in one of six different tables, depending
on the type. Note that the binary type and the glist type are not
currently implemented. The glist type could be implemented, as long as
the glist only stored strings ... The binary type could be implemented
with blobs.

Both the Account structure and the Transaction structure have version
numbers in them. These are used to compare the sql and the engine
contents, and update one or the other as appropriate. Version
numbers would not be necessary for single-user access, but are important
for multi-user access, where several engines must be kept in sync with
the database contents. An alternative to version numbers might have
been the date of the last update. However, version numbers are better
than dates in the case where the engines reside on machines whose clocks
are not closely synchronized (this may happen, e.g., if the machines
are not using NTP for time synchronization, or if one machine failed
to have daylight-savings time set correctly: its transactions would be
an hour newer/older than the others, leading to bad updates).

/* The pgendAccountCompareVersion() routine compares the version
 * number of the account in the engine and the sql database. It
 * returns a negative number if the sql version is older (or the
 * account is not present in the sql db). It returns a positive
 * number if the sql version is newer. It returns zero if the
 * two versions are equal.
 */

Version numbers still need to be written to the XML file.

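The sign convention above reduces to a small comparator. This is a
sketch, not the real routine: here the db's version is passed in
directly, with -1 standing for 'absent from the sql db'.

```c
/* Sketch of the comparison contract described above.
 * engine_v: version number held by the engine (the cache).
 * sql_v:    version number in the database, or -1 if absent. */
static int compare_version_sketch(int engine_v, int sql_v)
{
    if (sql_v < 0)        return -1;  /* not in sql db: treat as older */
    if (sql_v < engine_v) return -1;  /* sql copy is older  */
    if (sql_v > engine_v) return  1;  /* sql copy is newer  */
    return 0;                         /* both are in sync   */
}
```
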
The engine is structured so that whenever the GUI issues a query, that
query is passed to the backend. (The engine assumes the worst: that
the engine data cache is out of date and needs to be updated.)
Unfortunately, the GUI frequently queries for the same data several
times in rapid succession. If this is taken at face value, then
multiple redundant queries to the SQL server occur in quick succession.

The version_check field is used to minimize these redundant queries.
It stores a timestamp; if the timestamp is less than 10 seconds old
(MAX_VERSION_AGE), then the backend assumes that the engine data is
sufficiently recent, and the sql query is skipped. Under certain
situations, a considerable amount of querying can be avoided.

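In outline, the check looks like this. This is a sketch: in the real
backend the timestamp lives on the backend structure, and only the
MAX_VERSION_AGE name and the 10-second value come from the text above.

```c
#include <time.h>

#define MAX_VERSION_AGE 10  /* seconds, per the text above */

/* Sketch: return 1 if the cached data is fresh enough that the
 * sql query can be skipped, 0 if the versions must be re-checked. */
static int version_check_is_fresh(time_t version_check, time_t now)
{
    return (now - version_check) < MAX_VERSION_AGE;
}
```
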
The backend keeps an audit trail of created, modified and deleted
transactions. These are stored in a set of tables inheriting from
the gncAuditTrail table. Recorded are the date of modification,
the session id, and the type of change.

There is currently no support within the GUI for viewing the audit
trails, nor is there any way to roll back to some previous state
based on this information.

The audit trails are used internally to resolve certain multi-user
editing conflicts: for example, when one user attempts to edit a
transaction that has been deleted by another.

They are also used to discover changes that other users have made,
and thus update the local GUI display. This is done in the
pgendProcessEvents() routine, and specifically in the get_event_cb()
callback.

The GUI displays a running balance in the register display. When the
engine has a copy of all data, this is easy to compute. However, if
the dataset is large, then we don't want the engine to have a copy of
all of the data; we want to leave the bulk of it in the database.
That presents a problem for computing the running balances.
We could store a running balance with each journal entry; however,
this has the potential of making balance updates slow: potentially
a lot of entries would need to be updated.

As an alternate technique, we store running balances in a set of
'checkpoints', each showing a subtotal balance for a date interval.
Unfortunately, there is quite a bit of machinery that needs to be
implemented in order to make this effective.

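The payoff is that a balance at a given date can be assembled from a
handful of checkpoint subtotals plus one small query over the open
(uncheckpointed) interval, instead of a scan over every entry. As a
sketch (the function name and arguments are illustrative only):

```c
/* Sketch: the running balance is the sum of the closed-interval
 * checkpoint subtotals, plus the subtotal of the (few) entries
 * posted after the last checkpoint. */
static double balance_from_checkpoints(const double *subtotals, int n,
                                       double open_interval_sum)
{
    double balance = open_interval_sum;
    for (int i = 0; i < n; i++)
        balance += subtotals[i];
    return balance;
}
```
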
Account balances can be computed using advanced SQL statements.
The basic idea looks like this:

   UPDATE gnccheckpoint
      SET balance = (SELECT sum(expr) FROM .... WHERE
                     txdate BETWEEN today - 7 AND today);

The above is not a valid SQL statement; below is one that actually
works. Note that this statement updates *all* checkpoints for the
indicated accountguid.

   UPDATE gnccheckpoint
      SET balance = (SELECT sum(gncentry.amount)
                     FROM gncentry, gnctransaction
                     WHERE
                     gncentry.accountguid = gnccheckpoint.accountguid AND
                     gncentry.transguid = gnctransaction.transguid AND
                     gnctransaction.date_posted BETWEEN date_xpoint AND
                         date_end)
      WHERE accountguid='111';

It's better to create a function that does the computation:

   CREATE FUNCTION gncsubtotal (char(32), datetime, datetime)
      RETURNS numeric
      AS 'SELECT sum(gncentry.amount)
          FROM gncentry, gnctransaction
          WHERE
          gncentry.accountguid = $1 AND
          gncentry.transguid = gnctransaction.transguid AND
          gnctransaction.date_posted BETWEEN $2 AND $3'
      LANGUAGE 'sql';

and use it like this:

   UPDATE gnccheckpoint
      SET balance = (gncsubtotal (accountguid, date_start, date_end))
      WHERE accountguid='4c9cad7be044559705988c63ea7affc5';

We can find dates for creating checkpoints like so:

   SELECT gnctransaction.date_posted
      FROM gnctransaction, gncentry
      WHERE
      gncentry.transguid = gnctransaction.transguid AND
      gncentry.accountguid='4c9cad7be044559705988c63ea7affc5'
      ORDER BY gnctransaction.date_posted ASC;

Upgrading the Table Schemas
----------------------------
If you find you need to change the structure of the sql tables, then
note that there is a fully general automatic upgrade mechanism. It's
in upgrade.c, upgrade.h.

Upgrades are treated like patches; the sql db is 'patched'. Each patch
carries three version numbers: major, minor, rev. A client can work
with a database only if the client's & database's major version numbers
are equal, and the client's minor number is equal to or newer than the
db's. The third number, the rev number, is irrelevant if the above
condition is met. The rev number is handy only for making sure that
patches are applied in a specified order.

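The compatibility rule reduces to a small predicate. This is a sketch
with made-up names; the real check is part of the login path.

```c
/* Sketch of the rule above: major numbers must match exactly, and
 * the client's minor number must be >= the database's.  The rev
 * number does not participate in the compatibility test. */
static int client_can_use_db(int cli_major, int cli_minor,
                             int db_major, int db_minor)
{
    if (cli_major != db_major)
        return 0;
    return cli_minor >= db_minor;
}
```
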
The gncVersion table stores these three numbers. It also stores a
human-readable text string, so that a sysadmin can review the
installed upgrades.

Most of the contents of upgrade.c is a mechanism to make sure that
'ad hoc' upgrades are installed in the appropriate order, i.e. that
the upgrade process stays 'safe' and backwards-compatible. The login
process in PostgresBackend.c is structured so that older databases
are detected: the GUI should pop up a message asking the user whether
they want to upgrade or not.

If the user wants to upgrade, then the pgendUpgradeDB() routine is
to be called. This routine is a set of nested case statements that
compare version numbers and apply patches as needed. As of this
writing, there is only one upgrade: 'put_iguid_in_tables()'. Note
how 'put_iguid_in_tables()' is written in a simple ad-hoc manner.
That's because everything else in upgrade.c is focused on trying to
figure out when it's appropriate to call 'put_iguid_in_tables()'.
Other upgrades should follow this same pattern: create the ad-hoc
routine, and plug it into a case statement in the pgendUpgradeDB()
call. Everything else is automatic.

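The dispatch pattern can be sketched like so. The version numbers and
the stub below are hypothetical; the real logic, and the real
put_iguid_in_tables(), are in upgrade.c.

```c
/* Sketch of the pgendUpgradeDB() pattern: nested version checks
 * decide which ad-hoc patch routines need to run. */
static int iguid_patch_applied = 0;

static void put_iguid_in_tables_stub(void)  /* stand-in for the real patch */
{
    iguid_patch_applied = 1;
}

static void upgrade_db_sketch(int db_major, int db_minor)
{
    switch (db_major) {
    case 1:
        if (db_minor < 2)               /* hypothetical: patch landed in 1.2 */
            put_iguid_in_tables_stub();
        /* future patches: chain further version checks here */
        break;
    default:
        break;                          /* unknown major: nothing we can do */
    }
}
```
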
This upgrade process applies only to newer clients connecting to
older databases. Otherwise, if the client is creating a fresh, brand
new db, then one does not need to upgrade: put the newest db design
into table-create.sql, as usual, and put the version number
into table-version.sql.

-----------------------------------------------------------------------