\documentclass[a4paper]{jpconf}
\title{MAUS: MICE Analysis User Software}
\author{M. Jackson (EPSRC)\footnote{We would like to acknowledge the assistance of the Software Sustainability Institute. The work carried out by the SSI is supported by EPSRC through grant EP/H043160/1.}, D. Rajaram (IIT) and C.D. Tunnell (U. Oxford)}
\ead{michaelj@epcc.ed.ac.uk, \{durga,tunnell\}@fnal.gov}
The Muon Ionization Cooling Experiment (MICE) has developed the MICE Analysis User Software (MAUS) to simulate and analyze experimental data. It serves as the primary codebase for the experiment and provides, for example, online data quality checks and offline batch simulation and reconstruction. The code is structured in a framework inspired by Map-Reduce to allow parallelization, whether on a personal machine or in the control room. Various software engineering practices from industry are also used to ensure correct and maintainable physics code, including unit, functional, and integration tests, continuous integration and load testing, code reviews, and distributed version control. Lastly, various smaller design decisions, such as using JSON as the data structure, using SWIG to allow developers to write components in either Python or C++, and using the SCons Python-based build system, may be of interest to other experiments.
%\section{Introduction}
%1. Overview of MICE goals\\
%2. MICE software requirements and goals\\
%3. Software framework\\
%4. Software components\\
%6. Lessons from Industry \\
\section{The MICE Experiment}
The Muon Ionization Cooling Experiment (MICE) is based at the Rutherford Appleton Laboratory and aims to demonstrate the reduction of phase space for muon beams. This R\&D is important for ensuring efficient operation of proposed accelerators using muon beams (see \cite{c:IDR} and references therein). The R\&D program has proceeded in phases: over the last decade, numerous construction phases have been interleaved with running periods. Because the experiment is constantly changing and such an R\&D program runs on long time-scales, the software must remain correct and maintainable over the long term. This goal is complicated by the five detector technologies used within the experiment.
\section{Software Requirements}
The MICE analysis software must simultaneously be both particle physics code and accelerator physics code. The particle physics functionality includes features such as simulating the electronics response or reconstructing tracks, whilst the accelerator physics requires the computation of transfer matrices and Twiss parameters. Both types of functionality require knowledge of, for example, the magnetic fields and geometry, thus requiring a single software scope.
The requirements imposed upon the software were previously addressed by the G4MICE package \cite{g4mice}, created in 2002. Test coverage and documentation were missing for much of the code base, making development, use, and verification of the code challenging. This is a frequent problem in both physics and industry. The extraction principle outlined in \cite{refactoring} was used to refactor the code, where \emph{refactoring} is the systematic process of restructuring code without changing its external behaviour.
Since attempting to fully understand the previous code was inefficient owing to lost expertise, the code was frozen so that no changes were allowed, and wrappers were written so that it could interface with a new Python framework. Small pieces of code were then gradually written to replace the old frozen functionality. This new code had quality requirements: good comments, a style guide, and tests. This made it possible to slowly improve the quality and maintainability of the code base while retaining existing functionality.
The MICE Analysis User Software (MAUS) has been the official MICE software since 2010 and has prepared MICE for more complex data-taking scenarios with higher data rates \cite{maus}. The goal was to restructure the code into a Map-Reduce-inspired \cite{mapreduce} data flow in order to simplify the interfaces that developers have to follow and to aid running the code in parallel. It was felt that Map-Reduce parallelizes particle physics problems in a useful fashion, but the API was simplified to have \emph{transformers} and \emph{mergers} instead of maps and reduces.
The basic unit of information is the ``spill'', which corresponds to a single beam extraction. Spills are independent, thus simplifying parallelization. For example, a ``transformer'' should process a spill by converting the binary DAQ output into a processable data structure and then applying a track-fitting routine; the same can be done for the Monte Carlo (MC) simulation of the apparatus. A ``merger'' provides functionality that requires access to the entire data set rather than a single spill: evolution of parameters over time, making histograms, and so forth.
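The transformer/merger pattern described above can be sketched in plain Python. The \texttt{birth}/\texttt{process}/\texttt{death} method names and the toy components below are illustrative assumptions for this sketch, not the exact MAUS module API:

```python
# Minimal sketch of the transformer/merger pattern. The method names
# (birth/process/death) and the toy components are illustrative
# assumptions, not actual MAUS code.

class TruncateTransform:
    """A 'transformer': processes each spill independently."""
    def birth(self, config):
        self.threshold = config.get("threshold", 10)

    def process(self, spill):
        # e.g. keep only hits above a threshold
        spill["hits"] = [h for h in spill["hits"] if h > self.threshold]
        return spill

    def death(self):
        pass


class CountMerger:
    """A 'merger': accumulates state across the whole data set."""
    def birth(self, config):
        self.total_hits = 0

    def process(self, spill):
        self.total_hits += len(spill["hits"])
        return spill

    def death(self):
        return self.total_hits


def go(spills, transform, merger, config=None):
    """Drive the components, analogous in spirit to MAUS.Go()."""
    config = config or {}
    transform.birth(config)
    merger.birth(config)
    for spill in spills:
        merger.process(transform.process(spill))
    transform.death()
    return merger.death()


spills = [{"hits": [5, 12, 20]}, {"hits": [3, 15]}]
total = go(spills, TruncateTransform(), CountMerger())
print(total)  # 3 hits survive the threshold of 10
```

Because each spill passes through the transformer independently, the loop over spills can be farmed out to parallel workers, with only the merger requiring serialized access to the accumulated state.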
The JSON data structure is used to represent a spill in order to aid developers in extending it and users in understanding it.

\begin{figure}
\includegraphics*[width=168mm]{outfile}
\caption{\label{dataflow}A visual representation of a MAUS control macro that illustrates the data flow.}
\end{figure}
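To illustrate the spill representation, the following sketch serializes a spill-like structure to JSON. The field names here are illustrative assumptions, not the exact MAUS spill schema:

```python
import json

# Minimal sketch of a spill serialized as JSON. The field names below
# are illustrative assumptions, not the exact MAUS spill schema.
spill = {
    "spill_number": 1204,
    "daq_event_type": "physics_event",
    "mc_events": [],
    "recon_events": [
        {
            "part_event_number": 0,
            "tof_event": {"slab_hits": []},
            "tracker_event": {"space_points": []},
        }
    ],
}

text = json.dumps(spill, indent=2)
# JSON round-trips losslessly, so downstream components can parse
# the spill back into native data structures
assert json.loads(text) == spill
print(text)
```

Because the format is self-describing text, developers can extend the structure by adding keys without breaking components that ignore them, and users can inspect the data with any text editor.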
An example macro that controls MAUS is:

\begin{verbatim}
# File with particles to simulate
my_input = MAUS.InputJSON("evts.json")

# Create an empty group of maps, then
# populate it with the desired functionality
my_map = MAUS.MapGroup()

# Add geant4 Monte Carlo simulation
my_map.append(MAUS.MapSimulation())

# Add electronics models
my_map.append(MAUS.MapTOFDigitization())
my_map.append(MAUS.MapTrackerDigitization())

# Create set of standard demo plots
my_reduce = MAUS.ReduceMakeDemoPlots()

# Where to save output
filename = 'simulation.out'

# Create uncompressed file object
output_file = open(filename, 'w')

# Then construct a MAUS output component
my_output = MAUS.OutputJSON(output_file)

# The Go() call drives all the components you
# pass in; check the file defined above
# for the output
MAUS.Go(my_input, my_map,
        my_reduce, my_output)
\end{verbatim}
The data flow in MAUS is illustrated in Fig.~\ref{dataflow}. The macro language is Python, but components can be written in either Python or C++.
SWIG \cite{swig} is used to create Python bindings to C++ code automatically. The experience with SWIG has been mixed: it works well for well-defined APIs like those of the transformers and mergers, but it proved difficult to use SWIG to expose common C++ routines that would have been useful from Python code (for example, magnetic field look-ups). The Boost::Python libraries appear easier to use for this HEP use case, but their installation difficulties made them infeasible.
\section{Applying Lessons from Industry}
Knowledge gained within industry can be applied to help the project run more smoothly. Various industry practices were tried while developing MAUS.
\subsection{Project Management and Issue Tracker}
A paradigm shift in how the code was written came naturally with the change in how the project was managed. Before any code was written, a project management website was set up using the Redmine software, including a wiki and an issue tracker. This allowed people to keep track of task assignments, current bugs, and feature requests. It is a vital tool for establishing the status of various blocks of work while simultaneously providing a useful historical record and institutional memory. The decision to use Redmine appears to have been a good one, as its ease of use allowed this project management tool to expand to other parts of the MICE project.
\subsection{Code Reviews}
Code reviews are standard practice in industry but rare within physics, mostly due to the limited person-power of physics software projects. In MAUS, new code requires an hour of review before entering the trunk alongside other communal code. In addition to catching bugs, the review process helps spread knowledge of the project between developers: people learn from one another while the reliance on specific developers decreases.
\subsection{Static Code Analysis}
Static code analyzers such as Coverity \cite{coverity} were used. This type of tool inspects code to determine whether there are states the code can enter that may lead to unexpected behavior. For example, if a variable is not initialized and there is an execution path in which its use would lead to a segmentation fault, the tool alerts the user.

A static code analyzer finds problems of varying degrees of severity, and human intervention is required to categorize them. For this reason, it is inefficient to apply static code analysis to legacy code, since processing the wide range of reported errors takes an unrealistic amount of time. The optimal use is to rectify only the problems observed in new code, thus incrementally improving the code base.
However, it was decided to abandon static code analysis since the gains did not merit the time required.
\subsection{Tests and Continuous Integration}
Unit tests are small pieces of code that test other small pieces of code. They are meant to be granular, deterministic, and repeatable. Their purpose is to let a developer know whether they have broken preexisting code. These tests aid in creating releases, since one can verify that the code is still functional; if bugs are found, new tests are added to make sure the bug never resurfaces. This style of development also allows one to quickly narrow down the source of a problem. The unit test coverage within MAUS has been useful for letting developers know when a piece of code has broken but, most importantly, unit tests remove the fear of changing code that does not ``belong'' to them.
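As a sketch of the style of unit test described above, the following uses Python's standard \texttt{unittest} module. The \texttt{tof\_time\_of\_flight} helper is hypothetical, written only for this example; it is not actual MAUS code:

```python
import unittest

# tof_time_of_flight is a hypothetical helper written for this
# example; it is not actual MAUS code.
def tof_time_of_flight(upstream_ns, downstream_ns):
    """Return the time of flight between two TOF stations in ns."""
    if downstream_ns < upstream_ns:
        raise ValueError("downstream hit precedes upstream hit")
    return downstream_ns - upstream_ns


class TofTimeOfFlightTest(unittest.TestCase):
    def test_nominal(self):
        # Granular, deterministic, repeatable
        self.assertAlmostEqual(tof_time_of_flight(10.0, 35.5), 25.5)

    def test_rejects_reversed_hits(self):
        # A regression test like this is added whenever a bug is
        # found, so the bug can never silently resurface
        with self.assertRaises(ValueError):
            tof_time_of_flight(35.5, 10.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TofTimeOfFlightTest)
result = unittest.TextTestRunner().run(suite)
```

Tests of this granularity run in milliseconds, so they can be executed on every commit without slowing developers down.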
The entire system is checked by integration tests that execute at the application level. For example, a large-statistics simulation can be used to verify that physics quantities have not changed beyond their statistical uncertainties. These have proven to be the most useful tests, since they help ensure that the physics does not change.
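The integration-test idea can be sketched with a toy: a seeded Gaussian ``simulation'' stands in for a real MAUS job, and a physics quantity is compared against its reference value within statistical uncertainty. The reference momentum, width, and cut values are invented for illustration:

```python
import math
import random

# Toy stand-in for a high-statistics simulation job; the reference
# values below are invented for illustration, not MICE parameters.
random.seed(42)

REFERENCE_MEAN_MEV = 200.0   # reference muon momentum (illustrative)
SIGMA_MEV = 25.0             # assumed spread (illustrative)

# "Simulate" n particle momenta
n = 100_000
momenta = [random.gauss(REFERENCE_MEAN_MEV, SIGMA_MEV) for _ in range(n)]

mean = sum(momenta) / n
stderr = SIGMA_MEV / math.sqrt(n)

# The test passes if the simulated mean agrees with the reference
# value to within five standard errors of the mean
n_sigma = abs(mean - REFERENCE_MEAN_MEV) / stderr
assert n_sigma < 5.0, f"physics changed: mean off by {n_sigma:.1f} sigma"
print(f"mean = {mean:.2f} MeV/c ({n_sigma:.2f} standard errors from reference)")
```

Fixing the random seed keeps the check deterministic; a real integration test would instead run the full reconstruction chain and compare against archived reference histograms.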
Jenkins \cite{jenkins} performs continuous integration testing of the code. This tool runs the test suite in a number of different installation environments every time code is committed. A distributed version control system called Bazaar \cite{bzr} is used, and code from every user is tested before it enters the trunk.

These tools have been vital to the project, since developers are alerted to broken code. Jenkins compiles and tests the code on a wide range of Linux and Mac platforms to ensure that it can be deployed to any system. Continuous integration complements the unit and integration tests because frequently running the unit tests lets developers know instantly where and when a problem was introduced into the code base.
\subsection{Release Cycle}
Code that has been tested as described above is periodically released. Major releases occur every few months, and minor releases are biweekly. The limiting factor on the timescale for minor releases is how long it takes to develop and test new code. This quick release cycle means that bugs are resolved quickly.
\section{Conclusions}

The MAUS effort within the MICE experiment has proven to be a successful collaboration between physicists and software engineers. There are currently about ten active developers working as a team and using a wide range of tools and methods. A wealth of knowledge and experience exists in the software engineering community, and taking advantage of that knowledge has helped MICE.