<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<link rel="stylesheet" type="text/css" href="../theme/css/style.css" />
<title>HACKING Harvest</title>
<div id="topcap"></div>
<div id="logo-floater">
<a href="http://daniel.holba.ch/harvest/" title="Harvest">
<img alt="Ubuntu" id="logo" src="../theme/images/harvest.png" />
<form action="handler.py" id="pkg-search-box">
<input type="text" name="pkg" size="27" />
<input type="submit" value="Search Package Name" />
<a href="http://daniel.holba.ch/harvest/">
<img alt="" src="https://help.ubuntu.com/htdocs/ubuntunew/img/help-about.png" />
<span>Finding low-hanging fruit...</span>
<h1>HACKING Harvest</h1>
<p>The Harvest code lives in <a href="https://code.launchpad.net/harvest">Launchpad</a> and uses <a href="http://python.org/">Python</a> and <a href="https://storm.canonical.com/">Storm</a>.</p>
<p>Harvest regularly pulls data from URLs stored in <a href="https://code.launchpad.net/harvest-data">this branch</a>. The file layout is simple:</p>
<pre>daniel@bert:~/bzr/harvest-data$ ls
clues opportunities
daniel@bert:~/bzr/harvest-data$ </pre>
<p>Before downloading a CSV (comma-separated values) file, Harvest checks the <tt>Last-Modified</tt> HTTP response header to see whether the file has changed since the last pull. This reduces traffic.</p>
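<p>The check described above could be sketched roughly as follows. This is an illustration in Python 3 and not Harvest's actual code: the function names <tt>is_modified</tt> and <tt>fetch_last_modified</tt> are hypothetical, and the sketch assumes that the previously seen header value is stored somewhere between runs.</p>

```python
from email.utils import parsedate_to_datetime
import urllib.request


def is_modified(last_modified_header, previous_header):
    """Return True if the remote file looks newer than the copy already pulled.

    Both arguments are raw Last-Modified header values (or None).
    """
    if not last_modified_header or not previous_header:
        # Without both timestamps we cannot tell, so download to be safe.
        return True
    remote = parsedate_to_datetime(last_modified_header)
    cached = parsedate_to_datetime(previous_header)
    return remote > cached


def fetch_last_modified(url):
    """Issue a HEAD request and return the Last-Modified header, or None."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Last-Modified")
```

<p>A caller would compare <tt>fetch_last_modified(url)</tt> against the stored value and download the CSV file only when <tt>is_modified</tt> returns true.</p>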
<h2>Opportunities?</h2>
<p>The <tt>opportunities</tt> file is a CSV file with the following format:</p>
<pre>&lt;url&gt;,&lt;description&gt;</pre>
<p>Each URL must point to a CSV file that is reachable via HTTP(S). The <tt>description</tt> is optional.</p>
<p>Each of those CSV files in turn needs to be of the following form:</p>
<pre>&lt;sourcepackage&gt;,&lt;url&gt;,&lt;description&gt;</pre>
<pre>vdrift,http://launchpad.net/bugs/106854,106854</pre>
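<p>Reading such a file is straightforward with Python's <tt>csv</tt> module. The following is a minimal sketch, not Harvest's own parser: the name <tt>parse_opportunities</tt> is hypothetical, and treating a missing description as an empty string is an assumption.</p>

```python
import csv
import io


def parse_opportunities(csv_text):
    """Yield (sourcepackage, url, description) tuples from CSV text."""
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            # Skip blank lines and rows without both a package and a URL.
            continue
        package = row[0].strip()
        url = row[1].strip()
        # Assumption: a row without a third column has an empty description.
        description = row[2].strip() if len(row) > 2 else ""
        yield package, url, description
```

<p>For the example row above, <tt>list(parse_opportunities(...))</tt> yields a single tuple holding the package name <tt>vdrift</tt>, the bug URL and the description <tt>106854</tt>.</p>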
<p>Opportunities can be anything:</p>
<li>Suitable upstream patches</li>
<li>Patches from other distributions</li>
<li>Problems in the CD builds that should be fixed</li>
<p>Let your imagination go wild. :-)</p>
<p>The <tt>clues</tt> file is a CSV file with the following format:</p>
<pre>&lt;url&gt;,&lt;score&gt;,&lt;description&gt;</pre>
<p>The URL points to another CSV file that is pulled regularly.
The score is a float that describes how good or bad it is for a package to be on that list
(e.g. an uninstallable package might be worth -500, while having 50% of its bugs forwarded upstream might be worth +300).
The scores are summed every time the HTML pages are generated, and the total can indicate whether the package is in good shape.</p>
<p>The format of the CSV file containing the <tt>clues</tt> is the same as that of the opportunities; right now, only the source package name is used.</p>
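<p>The summing of scores can be pictured with a short sketch. It is only an illustration under stated assumptions: the function name <tt>total_scores</tt> and the input shape (pairs of a score and the package names found in that clue list) are hypothetical, and counting a package once per list, even if it appears there several times, is an assumption.</p>

```python
from collections import defaultdict


def total_scores(clue_lists):
    """Sum clue scores per source package.

    clue_lists is an iterable of (score, packages) pairs, where packages
    holds the source package names listed in one clue CSV file.
    """
    totals = defaultdict(float)
    for score, packages in clue_lists:
        # Assumption: a package counts once per clue list.
        for package in set(packages):
            totals[package] += score
    return dict(totals)
```

<p>With the scores from the example above, a package that appears on both the "uninstallable" list (-500) and the "bugs forwarded upstream" list (+300) ends up with a total of -200.</p>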
<h2>Setting it up</h2>
<li><tt>sudo apt-get install libapache2-mod-python python-storm bzr</tt></li>
<li><tt>mkdir ~/bzr; cd ~/bzr; bzr init-repo harvest; cd harvest; bzr branch lp:harvest</tt></li>
<li><tt>cd /var/www; sudo ln -s ~/bzr/harvest</tt></li>
<li>Edit one of the files in <tt>/etc/apache2/sites-enabled/</tt> so that it contains a section like this one:
<pre>&lt;Directory /var/www/harvest/&gt;
AllowOverride FileInfo
&lt;/Directory&gt;</pre>
to make sure that the <tt>.htaccess</tt> file in the harvest tree is used.</li>
<li><tt>sudo /etc/init.d/apache2 restart</tt></li>
<div id="ubuntulinks">
<p class="copyright">© 2008 Canonical Ltd.</p>
<div id="bottomcap"></div>