
% With the commands after % signs you can define your own page size. Remove % to activate them and insert the values you want to define the size of the page you want.

%\voffset=-2cm

%\hoffset=-0.5cm

%\textwidth=15cm

%\parindent 0pt

%\parskip 2ex

\usepackage{subfig}

\usepackage{color}

\usepackage[usenames,dvipsnames,svgnames,table]{xcolor}

\usepackage[noindentafter]{titlesec}

\addtocounter{tocdepth}{1} % Increase maximum TOC level by one

\addtocounter{secnumdepth}{1} % Increase maximum section level by one

% Make paragraphs behave like subsubsections

\titleformat{\paragraph}[hang]{\sf\bfseries\normalfont\bf}{\thetitle\quad}{0pt}{}

\titlespacing{\paragraph}{0pt}{1em}{0.5em}

\usepackage{graphicx}

\usepackage{multirow}

\usepackage{rotating}

\usepackage{chngpage}

\usepackage{amsmath}

\usepackage{listings}

\newcommand{\HRule}{\rule{\linewidth}{0.5mm}}

\usepackage{abstract}

\renewcommand{\abstractname}{} % clear the title

\renewcommand{\absnamepos}{empty} % originally center

\begin{document}

\title{Global PID Framework Documentation}

\author{Celeste Pidcott}

\date{}

\maketitle

%\tableofcontents

%\clearpage

%\listoffigures

%\listoftables

%\clearpage

%\begin{abstract}

%\end{abstract}

\section{Introduction}

\label{intro}

The global PID framework uses sets of PID variables for two purposes: 1) to create PDFs of these variables from MC data for a range of particle hypotheses, and 2) to use these PDFs in a log-likelihood method to determine the PID of reconstructed global tracks from data. The framework is designed such that new PID variables can be added as they are developed. Section~\ref{use} of this document explains how to use the PID framework to produce PDFs, and how to perform PID on spill data contained within a Json document. Section~\ref{mapred} details how these two actions are performed within the code, and Section~\ref{PID} discusses the PID variables: their structure, how new ones can be added to the framework, and details of those already in place. This document will be updated as the PID framework and variables continue to be developed.

\section{Using the PID scripts}

\label{use}

\subsection{Producing PDFs}

\label{PDFs}

Whilst the PID framework comes with PDFs provided in PIDhists.root, it is possible for a user to produce PDFs for hypotheses not included within this file. The following describes how this should be done.

\begin{figure}[h!]

\begin{center}

\includegraphics[width=2in]{pdfprodflow.pdf}

\caption{Steps involved in producing a PDF from MC data}

\label{pdfprod}

\end{center}

\end{figure}

\begin{itemize}

\item Simulation: Production of MC data for a given particle hypothesis. To produce the MC data and write it to a Json file, a copy of simulate\_mice.py should be made, with my\_output set to MAUS.OutputPyJSON().

\item Global Reconstruction: The MC data should then be passed through the global reconstruction. Detector information is currently added to global tracks using the GlobalReconImport.py script in \$\{MAUS\_ROOT\_DIR\}/bin/Global. This script calls the mapper MapCppGlobalReconImport, which constructs the global tracks required for the calculation of PID variables. The control variables that specify the input Json data sample and the name of the output Json file that contains the reconstructed tracks can be set at the command line, or by using a datacard, as shown in Listing~\ref{globaldatacard}. To run the global reconstruction with the datacard, the following should be entered at the command line:

\begin{lstlisting}[breaklines=true,basicstyle=\ttfamily]

${MAUS_ROOT_DIR}/bin/Global/GlobalReconImport.py --configuration_file <name_of_datacard>

\end{lstlisting}

which for the example in Listing~\ref{globaldatacard} would be:

\begin{lstlisting}[breaklines=true,basicstyle=\ttfamily]
${MAUS_ROOT_DIR}/bin/Global/GlobalReconImport.py --configuration_file ex_global_datacard.py
\end{lstlisting}

\item PDF Production: To produce the PDFs from the reconstructed MC data, pid\_pdf\_generator.py in \$\{MAUS\_ROOT\_DIR\}/bin/Global is then used. This script calls the reducer ReduceCppGlobalPID. The script is run with a datacard, such as that shown in Listing~\ref{pdfdatacard}, which includes the input Json filename, the global\_pid\_hypothesis for which the PDF(s) are to be produced, and a unique\_identifier (typically the time and date at which the script is run), by entering at the command line:

\begin{lstlisting}[breaklines=true,basicstyle=\ttfamily]
${MAUS_ROOT_DIR}/bin/Global/pid_pdf_generator.py --configuration_file example_pdf_datacard.py
\end{lstlisting}

This will create a directory within \$\{MAUS\_ROOT\_DIR\}/files/PID corresponding to the hypothesis and identifier given by the datacard. This directory will contain one file for each PID variable, each holding the PDF for that hypothesis and variable.

\end{itemize}

\vspace*{1\baselineskip}

\begin{lstlisting}[language=Python,basicstyle=\ttfamily,frame=single,breaklines=true,commentstyle=\color{gray}, keywordstyle=\color{red}\bfseries,stringstyle=\color{green!50!black},captionpos=b,caption={An example datacard (ex\_global\_datacard.py) for use with GlobalReconImport.py},label=globaldatacard]
import os

# A json document containing spills from MC data
input_json_file_name = "example_hypothesis.json"
input_json_file_type = "text"

# The json document that the global tracks will be
# written to
output_json_file_name = "example_hypothesis_Global_Recon.json"
output_json_file_type = "text"
\end{lstlisting}

\vspace*{2\baselineskip}

\begin{lstlisting}[language=Python,basicstyle=\ttfamily,frame=single,commentstyle=\color{gray}, breaklines=true,keywordstyle=\color{red}\bfseries,stringstyle=\color{green!50!black},captionpos=b,caption={An example datacard (example\_pdf\_datacard.py) for use with pid\_pdf\_generator.py},label=pdfdatacard]
import os
import datetime

# Use the current time and date as a unique
# identifier when creating files to contain PDFs.
# A unique_identifier is required by the reducer,
# and PDF production will fail without one.
now = datetime.datetime.now()
unique_identifier = now.strftime("%Y_%m_%dT%H_%M_%S_%f")

# A json document containing global tracks from MC
# data
input_json_file_name = "example_hypothesis_Global_Recon.json"
input_json_file_type = "text"

# The particle hypothesis that the PDF is being
# created for. A global_pid_hypothesis is required
# by the reducer, and PDF production will fail
# without one.
global_pid_hypothesis = "example"
\end{lstlisting}

\subsection{Performing PID with pre-existing hypotheses}
\label{perf}

To perform PID on data, the steps shown in Figure~\ref{pidperf} should be followed.

\begin{figure}[h!]

\begin{center}
\includegraphics[width=2in]{pidperfflow.pdf}
\caption{Steps involved in performing the PID for a data sample}
\label{pidperf}
\end{center}

\end{figure}

\begin{itemize}
\item Data: This can be experimental or MC data; however, the spill data must be passed to the PID in a Json document.

\item Global Reconstruction: In the same way as described above, the data should then be passed through the global reconstruction, currently using the GlobalReconImport.py script in \$\{MAUS\_ROOT\_DIR\}/bin/Global, with a corresponding datacard containing the name of the input Json file and the name of the output file.

\item Global PID: To perform the PID on the reconstructed data, GlobalPID.py in \$\{MAUS\_ROOT\_DIR\}/bin/Global is then used. This script calls the MapCppGlobalPID mapper. The script is run with a datacard, such as that shown in Listing~\ref{piddatacard}, which includes the input and output Json filenames, by entering the following at the command line:

\begin{lstlisting}[breaklines=true,basicstyle=\ttfamily]
${MAUS_ROOT_DIR}/bin/Global/GlobalPID.py --configuration_file example_pid_datacard.py
\end{lstlisting}
\end{itemize}

\begin{lstlisting}[language=Python,basicstyle=\ttfamily,breaklines=true,frame=single,commentstyle=\color{gray}, keywordstyle=\color{red}\bfseries,stringstyle=\color{green!50!black},captionpos=b,caption={An example datacard (example\_pid\_datacard.py) for use with GlobalPID.py},label=piddatacard]
import os

# A json document containing spills from data
input_json_file_name = "example_hypothesis_Global_Recon.json"
input_json_file_type = "text"

# The json document that the global tracks will be
# written to
output_json_file_name = "example_hypothesis_Global_PID.json"
output_json_file_type = "text"
\end{lstlisting}

As the framework currently stands, the output document would now contain the global tracks with the PID set (where it has been possible to do so) to whichever particle hypothesis had the highest log-likelihood. For tracks where the PID could not be determined, the track PID will be left as 0.

\section{MapCppGlobalPID and ReduceCppGlobalPID}
\label{mapred}

\subsection{MapCppGlobalPID}
\label{map}

The steps taken in MapCppGlobalPID for a single track are shown in Figure~\ref{mapflow}. In more detail: the data, having passed through the global reconstruction, is passed to the PID. For each track, the value of each PID variable is calculated. Each value is then compared to the corresponding PDF for every particle hypothesis, with the content of the bin in which the value falls providing the probability from which the log-likelihood is calculated. For each particle hypothesis, the log-likelihoods of all of the PID variables are summed to give a log-likelihood for that hypothesis. The PID of the track is then obtained by comparing the log-likelihoods of the hypotheses.

\begin{figure}[h!]

\begin{center}
\includegraphics[width=5in]{PIDflow.pdf}
\caption{Flow chart detailing steps taken in MapCppGlobalPID}
\label{mapflow}
\end{center}

\end{figure}
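The log-likelihood comparison described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the MapCppGlobalPID implementation: each PDF is assumed to be a normalised list of bin contents over a fixed range, a hypothesis is assumed to give no result if any variable falls outside its PDF range or in an empty bin, and the names PDF, log\_likelihood and identify, the example hypotheses, and the bin contents are all hypothetical.

\begin{lstlisting}[language=Python,basicstyle=\ttfamily,breaklines=true]
import math


class PDF(object):
    """A normalised histogram over [minimum, maximum) (illustrative only)."""

    def __init__(self, minimum, maximum, contents):
        self.minimum = minimum
        self.maximum = maximum
        self.contents = contents

    def probability(self, value):
        """Return the content of the bin holding value, or None if out of range."""
        if not self.minimum <= value < self.maximum:
            return None
        width = (self.maximum - self.minimum) / float(len(self.contents))
        index = min(int((value - self.minimum) / width), len(self.contents) - 1)
        return self.contents[index]


def log_likelihood(values, pdfs):
    """Sum log(probability) over all PID variables for one hypothesis."""
    total = 0.0
    for name, value in values.items():
        prob = pdfs[name].probability(value)
        if prob is None or prob <= 0.0:
            # Assumption: out-of-range value or empty bin gives no result
            return None
        total += math.log(prob)
    return total


def identify(values, pdfs_by_hypothesis):
    """Return the hypothesis with the highest log-likelihood, or None."""
    best, best_ll = None, None
    for hypothesis, pdfs in pdfs_by_hypothesis.items():
        ll = log_likelihood(values, pdfs)
        if ll is not None and (best_ll is None or ll > best_ll):
            best, best_ll = hypothesis, ll
    return best


# Hypothetical PDFs for a single variable and two hypotheses
pdfs_by_hypothesis = {
    "muon": {"var_a": PDF(0.0, 4.0, [0.1, 0.6, 0.2, 0.1])},
    "pion": {"var_a": PDF(0.0, 4.0, [0.1, 0.2, 0.6, 0.1])},
}
print(identify({"var_a": 1.5}, pdfs_by_hypothesis))
\end{lstlisting}

In this sketch a track whose PID cannot be determined yields None; how the framework itself records such tracks is described in Section~\ref{perf}.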

\subsection{ReduceCppGlobalPID}
\label{reducer}

The steps taken in ReduceCppGlobalPID are shown in Figure~\ref{reduceflow}. MC data for a given particle hypothesis, having passed through the global reconstruction, is passed to the PID. For each track, the value of each PID variable is calculated, and a histogram is filled with these values. If the behaviour has been turned on in the PID variable class, a single event is then spread over all bins in the histogram. This ensures that when the PDF is used by the PID there will be no empty bins, thus avoiding cases where the log-likelihood calculation takes the log of zero. The histogram is then normalised to create the PDF, which is written to file.

If an MC track returns a variable value outside of the allowed range of the histogram (as defined within the variable class), then the value for that track is not included.

\begin{figure}[h!]

\begin{center}
\includegraphics[width=5in]{PDFflow.pdf}
\caption{Flow chart detailing steps taken in ReduceCppGlobalPID}
\label{reduceflow}
\end{center}

\end{figure}
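The PDF production steps described above can be illustrated with a short Python sketch. It is not the ReduceCppGlobalPID code: the function name make\_pdf and its arguments are hypothetical, a plain list stands in for the histogram, and the single extra event is assumed to be shared evenly between the bins.

\begin{lstlisting}[language=Python,basicstyle=\ttfamily,breaklines=true]
def make_pdf(values, minimum, maximum, n_bins, non_zero_hist_entries=True):
    """Build a normalised histogram (PDF) from a list of variable values."""
    contents = [0.0] * n_bins
    width = (maximum - minimum) / float(n_bins)
    for value in values:
        if not minimum <= value < maximum:
            continue  # values outside the allowed range are not included
        index = min(int((value - minimum) / width), n_bins - 1)
        contents[index] += 1.0
    if non_zero_hist_entries:
        # Assumption: one event shared evenly, so that no bin is empty
        contents = [c + 1.0 / n_bins for c in contents]
    total = sum(contents)
    if total == 0.0:
        return contents  # nothing to normalise
    return [c / total for c in contents]


# Hypothetical variable values; -1.0 lies outside the range and is skipped
pdf = make_pdf([0.5, 1.2, 1.7, -1.0], minimum=0.0, maximum=4.0, n_bins=4)
print(pdf)
\end{lstlisting}

With the extra event switched on, every bin of the resulting PDF is greater than zero, so taking its logarithm is always defined.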

\section{PID Variables}
\label{PID}

Information from the MICE detectors will be incorporated into a set of PID variables that can be used to distinguish between particle hypotheses. The Global PID framework has been written such that any number of PID variables can be developed and added as necessary, each represented by its own class, derived from a base class.

\subsection{PID Base Class}
\label{PIDBase}

The base PID class (PIDBase.hh and .cc) contains the functions to:
\begin{itemize}
\item Create the PDFs (and the files that contain them).
\item Use the PDFs with globally reconstructed tracks.
\item Populate the PDFs with variable values (after checking that the value is valid).
\item Perform the log-likelihood calculation for an incoming globally reconstructed track (after checking that the value of the variable for the track falls within the range of the PDF).
\item Calculate the value of the PID variable (this is a virtual function to be defined in the derived classes).
\end{itemize}

\subsection{PID Variable Classes}
\label{PIDVar}

Each PID variable is implemented in a class derived from the base PID class. Because of how the framework is designed, new variables can be added as they are developed.

\subsubsection{Adding PID Variables}
\label{addvar}

Each derived variable class should do the following:
\begin{itemize}
\item Set the variable name.
\item Define the function that calculates the PID variable.
\item Set the minimum, maximum, and number of bins for PDFs created using the variable. The values of the minimum and maximum define the allowed range of values that the PID variable can take.
\item In some cases it may be necessary to ensure that all bins in a PDF have non-zero entries. Setting the variable \_nonZeroHistEntries to true adds a single event spread across all bins.
\end{itemize}
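The ingredients listed above can be pictured with the following Python analogue. The real variable classes are C++ classes derived from the base PID class, so this is purely a sketch: the class names, the attribute names, the example range and binning, and the dictionary used as a stand-in for a track are all hypothetical.

\begin{lstlisting}[language=Python,basicstyle=\ttfamily,breaklines=true]
class PIDVariableSketch(object):
    """Hypothetical Python analogue of a derived PID variable class."""
    name = None                    # the variable name
    minimum = None                 # lower edge of the allowed range
    maximum = None                 # upper edge of the allowed range
    n_bins = None                  # number of bins in PDFs for this variable
    non_zero_hist_entries = False  # add one event spread across all bins

    def calculate(self, track):
        """Return the value of the variable for a track (set per variable)."""
        raise NotImplementedError

    def is_valid(self, value):
        """True if value lies within the allowed range of the variable."""
        return self.minimum <= value <= self.maximum


class ExampleVariable(PIDVariableSketch):
    name = "example_variable"
    minimum = 0.0
    maximum = 10.0
    n_bins = 50
    non_zero_hist_entries = True

    def calculate(self, track):
        # 'track' is a plain dict here; 'example_quantity' is a made-up key
        return track["example_quantity"]


var = ExampleVariable()
value = var.calculate({"example_quantity": 3.2})
print(var.name, value, var.is_valid(value))
\end{lstlisting}

Keeping the range and binning next to the calculation means that PDF production and the PID itself both use the same definition of a valid value.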

\subsubsection{PIDVarA}

\label{PIDVarA}

There is currently a single PID variable defined within the framework, PIDVarA (see PIDVarA.hh and .cc), which uses the difference between the times measured at TOF1 and TOF0 as its variable. A valid value of the variable is returned only for tracks where there is a single TOF0 and a single TOF1 time measurement, and for which the time difference between the detectors falls within the minimum and maximum set within the class. Otherwise, the value of the variable is set to $-1$, such that it falls outside of the allowed range for the variable, and so the variable for the track is not used in PDF production, or in the PID.
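The validity logic described above can be sketched in Python as follows. This is an illustration rather than the PIDVarA code: the function name, the range limits MIN\_DT and MAX\_DT, and the use of plain lists of times are assumptions, and the difference is taken as the TOF1 time minus the TOF0 time.

\begin{lstlisting}[language=Python,basicstyle=\ttfamily,breaklines=true]
# Hypothetical range; the real limits are set within the PIDVarA class
MIN_DT = 20.0
MAX_DT = 40.0
INVALID = -1.0  # lies outside the allowed range, so the track is ignored


def tof_time_difference(tof0_times, tof1_times):
    """Return the TOF1 - TOF0 time difference, or INVALID if unusable."""
    # Require exactly one time measurement in each detector
    if len(tof0_times) != 1 or len(tof1_times) != 1:
        return INVALID
    dt = tof1_times[0] - tof0_times[0]
    # Require the difference to lie within the allowed range
    if not MIN_DT <= dt <= MAX_DT:
        return INVALID
    return dt


print(tof_time_difference([100.0], [128.5]))         # single hits, in range
print(tof_time_difference([100.0, 101.0], [128.5]))  # two TOF0 measurements
\end{lstlisting}

Returning a value outside the allowed range, rather than raising an error, lets the same range check exclude the track from both PDF production and the PID.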


\clearpage


%\begin{thebibliography}{99}


%\end{thebibliography}


\end{document}
