=======================
 Implementation notes:
=======================

General
=======

The format of trac "revs" is in flux, but is closer to bzr revision
ids than to bzr revision numbers. This is needed to make multiple
branch support work and may allow showing more useful things (think
"subdiffs" of merges). There are a couple of annoyances:

- trac revs are not quite strings: the "@" character is special, and
  revs need to be usable as a parameter in an HTTP GET request, so we
  need to escape a bunch of stuff. The original intention was to use
  normalize_rev for this, but I think that will not work very well,
  since trac seems to *sometimes* hand you normalized revs and
  *sometimes* raw revs (meaning "normalizing" something twice has to
  be safe if we use it for escaping). So instead we keep our "actual"
  revids completely contained: trac only ever sees escaped revids, and
  we unescape them before handing them to bzrlib.

- trac wants to have a total ordering on revs. If we were using revnos
  this would of course be trivial but with revids it is not,
  especially if we start mixing branches. Will have to sort that out
  at some point. Currently we determine which one is an ancestor of
  the other, falling back to a time-based comparison if this fails.

- trac occasionally compares path values, which means we have to be
  careful about "normalizing" those. Currently the format is the one
  bzr hands us: always omit the starting "/" (which makes the root "").
  What trac uses for svn is "/" for the root, "foo/bar" for everything
  else. Hopefully it will cope with our usage. trac also occasionally
  passes a None path, which is taken as meaning the root ("").
  (Example of where this bit us in the past: if Node.get_history
  changes the path it returns, trac interprets this as a move, even if
  we do not pass the corresponding constant. So if the Node is
  constructed from a non-normalized path starting with a "/" and we
  switch to a path with no starting "/" when the node *content* first
  changes, trac picks it up as a move.)
  Again, trac may pass in an already normalized path, so make sure
  normalizing twice is safe. Notice that we do not always normalize
  passed-in paths, since bzr should handle anything trac hands us. We
  just need to make sure everything that escapes back to trac is
  normalized.

- trac wants the revision log of a dir to include what happens to its
  contents. bzr only notices changes to the dir itself (renames and moves
  of the dir, not its contents).

  One case where this is very important is get_previous of the root
  node. Since you can't rename or move the root this simply does not
  exist for bzr. Trac uses this to get the previous revision to
  display a changeset, so we really need to do this in the trac (svn)
  way.

  Another case is the revision attribute of BzrDirNode. If we pick the
  entry.revision for that we do not see the changes in subdirectories
  in the directory listings (as people expect from bzr/svn). If we
  pick the revisiontree revision id (which is the one the user is
  using to find us) then all subdirectories get the same log message
  (the one of the most recent commit to the tree, not to something
  under that subdir).
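
The escaping and path rules above can be illustrated with a minimal
sketch. It assumes Python 3's urllib.parse and uses percent-encoding
as one possible escaping; the names normalize_path, escape_revid and
unescape_revid are made up for illustration and are not necessarily
what the plugin itself uses.

```python
from urllib.parse import quote, unquote


def normalize_path(path):
    """Return the bzr-style form of a path: no leading "/", root is ""."""
    if path is None:
        # trac occasionally passes None, which is taken to mean the root.
        return ""
    # Stripping leading slashes twice gives the same result as once,
    # so it is safe if trac hands us an already normalized path.
    return path.lstrip("/")


def escape_revid(revid):
    """Hypothetical escaping: percent-encode everything, including "@"."""
    return quote(revid, safe="")


def unescape_revid(rev):
    """Inverse of escape_revid; apply just before handing the id to bzr."""
    return unquote(rev)
```

Note that escape_revid in this sketch is not idempotent: escaping an
already escaped rev turns "%40" into "%2540". That is the kind of
problem that makes it awkward to do the escaping in something trac
may call twice, and it only round-trips if escaping and unescaping
each happen exactly once at the boundary.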

Performance
===========

Constructing a BzrDirNode is slow because it walks its children to
determine the right "most recent" revision. Because it is pretty
common to iterate over the children afterwards, and we need to
determine their most recent revision anyway, we cache those values
(BzrDirNode.revcache). This speeds up the source browser in a
directory with a couple of subdirs.
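
The caching idea can be sketched as follows. This is not the code of
BzrDirNode: the tree layout, the function name most_recent and the use
of plain integers as revisions (so that max() gives "most recent") are
all assumptions made to keep the example self-contained.

```python
def most_recent(tree, path, revcache):
    """Return the newest revision at or below `path`, caching every result.

    `tree` maps a path to either an int (a file's last-changed revision)
    or a tuple (own_revision, [child paths]) for a directory. Plain ints
    stand in for revisions here purely so that max() works.
    """
    if path in revcache:
        return revcache[path]
    entry = tree[path]
    if isinstance(entry, int):
        newest = entry
    else:
        own_revision, children = entry
        # Walking the children fills revcache for each of them as well.
        newest = max([own_revision] +
                     [most_recent(tree, child, revcache) for child in children])
    revcache[path] = newest
    return newest


# Hypothetical tree: the root and "docs" are directories, the rest are files.
tree = {
    "": (1, ["docs", "setup.py"]),
    "docs": (2, ["docs/a.txt", "docs/b.txt"]),
    "docs/a.txt": 3,
    "docs/b.txt": 7,
    "setup.py": 5,
}
revcache = {}
newest_in_root = most_recent(tree, "", revcache)
```

After the single call for the root, revcache already holds an entry
for every child, so listing the children afterwards does not need to
walk the tree again.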

BzrDirNode.get_history is a bit slow and pretty memory-hungry. Again
the need to manually pick up changes to children is the root cause.
It has to construct full inventories and/or deltas between revisions
to pick up changes to children, while for the other node types we just
have to open the "versionedfile" for that particular file.

Probably the most questionable optimization is calling lock_read in
BzrRepository.__init__. This provides a *very* noticeable speed boost
by keeping the branch locked for the entire web request (try timing a
"log" of the root dir, which pulls in a hundred BzrChangesets). And at
a glance it looks like trac was designed with this thing in mind: it
has a close method for "closing the connection". Unfortunately it does
not actually call this method... We have a __del__ method to try to
improve the chance the branch is unlocked, and bzr has one too, and
this seems to be working so far. It is probably not really reliable
though.
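
The locking pattern described above can be sketched like this.
FakeBranch and RepositorySketch are hypothetical stand-ins, not the
plugin's classes; the only thing assumed about the real branch object
is that it offers lock_read() and unlock().

```python
class FakeBranch:
    """Stand-in for a bzr branch; it only counts outstanding read locks."""

    def __init__(self):
        self.locks = 0

    def lock_read(self):
        self.locks += 1

    def unlock(self):
        self.locks -= 1


class RepositorySketch:
    """Holds a read lock from construction until close() (or __del__)."""

    def __init__(self, branch):
        self.branch = branch
        self.branch.lock_read()
        self._locked = True

    def close(self):
        # Safe to call more than once: only the first call unlocks.
        if getattr(self, "_locked", False):
            self._locked = False
            self.branch.unlock()

    def __del__(self):
        # Fallback only; Python does not guarantee when this runs.
        self.close()
```

Making close() idempotent means it does not matter whether the caller,
__del__, or both end up releasing the lock.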

Testsuite
=========

The testsuite can be run using one of the many Python testsuite runners.
A good one is "trial" from the Twisted folks. To use it, run:

$ trial tracbzr