Posts Tagged ‘software’

There’s a software tool I use almost every day, for almost any work situation. It’s good for designing event selections, for brainstorming about systematic errors, and for mesmerizing kids at outreach events. It’s good anytime you want to build intuition about the detector. It’s our event viewer. In this post, I explain a bit about how I use our event viewer, and also share the perspective of code architect Steve Jackson, who put the code together.

Steamshovel event viewer showing the event Mr. Snuffleupagus

The IceCube detector is buried in the glacier under the South Pole. The signals can only be read out electronically; there’s no way to reach the detector modules after the ice freezes around them. In designing the detector, we carefully considered what readout we would need to describe what happens in the ice, and now we’re at the stage of interpreting that data. A signal from one detector module might tell us the time, amplitude, and duration of light arriving at that detector, and we put those together into a picture of the detector. From five thousand points of light (or darkness), we have to answer: where did this particle come from? Does the random detector noise act the way we think it acts? Is the disruption from dust in the ice the same in all directions? All these questions are answerable, but the answers take some teasing out.

To help build our intuition, we use event viewer software to make animated views of interesting events. It’s one of our most useful tools as physicist-programmers. Like all bits of our software, it’s written within the collaboration, based on lots of open-source software, and unique to our experiment. It’s called “steamshovel,” a joke on the idea that you use it to dig through ice (actually, dig through IceCube data – but that’s the joke).

Meet Steve Jackson and Steamshovel

IceCube data from the event Mr. Snuffleupagus

Steve Jackson’s job on IceCube was originally maintaining the central software, a very broad job description. His background is in software including visualizations, and he’s worked as The Software Guy in several different physics contexts, including medical, nuclear, and astrophysics. After becoming acquainted with IceCube software needs, he narrowed his focus to building an upgraded version of the event viewer from scratch.

The idea of the new viewer, Steamshovel, was to write a general core in the programming language C++, and then higher-level functionality in Python. This splits the problem of drawing physics in the detector into two smaller problems: how to translate physics into easily describable shapes, like spheres and lines, and how to draw those spheres and lines in the most useful way. Separating these two levels makes the code easier to maintain, makes the core easier to update, and makes it easier for other people to add new physics ideas, but it doesn't make it easier to write in the first place. (I'll add: that's why we hire a professional!) Steve says the process took about as long as he could have expected, considering Hofstadter's Law, and he's happy with the final product.

A Layer of Indirection 

As Steve told me, “Every problem in computer science can be addressed by adding a layer of indirection: some sort of intermediate layer where you abstract the relevant concepts into a higher level.” The extra level here is the set of lines and spheres that get passed from the Python code to the C++ code. By separating the defining from the drawing, this intermediate level makes it simpler to define new kinds of objects to draw.
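To make that concrete, here's a minimal sketch of the pattern in C++. This is not Steamshovel's actual interface; every name here is hypothetical. The point is only the shape of the design: the high-level code emits simple primitives, and the low-level code consumes nothing but those primitives.

#include <vector>

// Hypothetical sketch of the indirection layer (not Steamshovel's real API).
// High-level "artists" emit simple primitives; a low-level renderer draws them.
struct Sphere { double x, y, z, radius; };
struct Line   { double x1, y1, z1, x2, y2, z2; };
struct Scene  { std::vector<Sphere> spheres; std::vector<Line> lines; };

struct Hit { double x, y, z, charge; };  // stand-in for a detector hit

Scene drawHits(const std::vector<Hit>& hits) {
    Scene scene;
    for (const Hit& h : hits)
        scene.spheres.push_back({h.x, h.y, h.z, 0.1 * h.charge});  // bigger charge, bigger sphere
    return scene;
}

void render(const Scene& scene) {
    // ...OpenGL calls that know only about spheres and lines,
    // and nothing at all about hits, charge, or physics...
}

Because the renderer only ever sees spheres and lines, adding a brand-new kind of physics display means writing a new function like drawHits, not touching the OpenGL code.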

A solid backbone, written with OpenGL in C++, empowers the average grad student to write software visualization "artists" as Python classes. These artists can connect novel physics ideas, written in Python, to the C++ backbone, without the grad student having to get into the details of OpenGL or, hopefully, any C++.

Here's a test of that simplicity: as part of our week-long, whirlwind introduction to IceCube software, we taught new students how to write a new Steamshovel artist. With just a week of software training, they were able to produce working artists, a testament to the usability of the Steamshovel backbone.

This separation also lets the backbone include important design details that might not occur to the average grad student, but that make the final product more elegant. One such detail is that the user can specify zoom levels much more easily, so graphics are not limited to the size of a computer screen. Making high-resolution graphics suitable for publication is possible and easy. Using these new views, we've made magazine covers, t-shirts, even temporary tattoos.

Many Platforms, Many People

IceCube is in the interesting situation of supporting (and having users run) our software on many different UNIX-like operating systems: Mac, Ubuntu, Red Hat, Fedora, Scientific Linux, even FreeBSD. But we don't test our software on Windows, which is the standard for many complex visualization packages: yet another good reason to use the simpler OpenGL. "For cross-platform 3D graphics," Steve says, "OpenGL is the low-level drawing API."

As visualization software goes, the IceCube case is relatively simple. You can describe all the interesting things with lines and spheres: dots for detector modules, lines and cylinders for the cables connecting them or for particle tracks, and spheres of configurable color and size for hits within the detector. There's relatively little motion beyond appearing, disappearing, and changing sizes. The light source never moves. I would add that this is nothing – nothing! – like Pixar. These simplifications mean that the more complex software packages that Steve had the option to use were unnecessarily complex, full of options that he would never use, and the simple, open-source OpenGL was perfectly sufficient.

The process of writing Steamshovel wasn't just a one-man job (even though I only talked to one person for this post). Steve solicited, and received, ideas for features from all over the collaboration. I personally remember that when he started working here, he took the diligent and kind step of sitting and talking to several of us while we used the old event viewer, just to see what the workflow was like, the good parts and the bad. One particularly collaborative sub-project started when one IceCube grad student, Jakob, had the clever idea of displaying Monte Carlo true Cherenkov cones. We know where the simulated light emissions are, and how the light travels through the ice – could we display the light cone arriving at the detector modules and see whether a particular hit occurred at the same time? Putting together the code to make this happen involved several people (mainly Jakob and Steve), and wouldn't have been possible coding in isolation.

Visual Cortex Processing

The moment that best captured the purpose of a good event viewer, Steve says, was when he animated an event for the first time. Specifically, he made the observed phototube pulses disappear as the charge died away, letting him see what happens on a phototube after the first signal. Animating the signal pulses made the afterpulsing “blindingly obvious.”

We know, on an intellectual level, that phototubes display afterpulsing, and it’s especially strong and likely after a strong signal pulse. But there’s a difference between knowing, intellectually, that a certain fraction of pulses will produce afterpulses and seeing those afterpulses displayed. We process information very differently if we can see it directly than if we have to construct a model in our heads based on interpreting numbers, or even graphs. An animation connects more deeply to our intuition and natural instinctive processes.

As Steve put it: “It brings to sharp relief something you only knew about in sort of a complex, long thought out way. The cool thing about visualization is that you can get things onto a screen that your brain will notice pre-cognitively; you don’t even have to consciously think to distinguish between a red square and a blue square. So even if you know that two things are different, from having looked carefully through the math, if you see those things in a picture, the difference jumps out without you even having to think about it. Your visual cortex does the work for you. […] That was one of the coolest moments for me, when these people who understood the physics in a deep way nonetheless were able to get new insights on it just by seeing the data displayed in a new way. ”

And that's why we need event viewers.


Since deciding to become a high energy physicist, I've had a much harder time answering a question often asked of scientists: "What's the practical application?"  After all, High Energy Physics is, for the most part, a basic science, meaning its long-term goals are to increase our understanding of the natural world; whereas in applied science (such as hydrogen fuel cell research) there is usually a targeted application from the get-go (i.e. hydrogen-powered automobiles).

When asked what’s the practical application of my research, I have a tough time answering.  After all, I study experimental Quantum Chromodynamics; and a “practical application” such as the light bulb (application of electromagnetism) or the transistor (quantum mechanics) may not arise in my lifetime.  But what I can say is the technologies developed to perform my research have a noticeable impact on our society (much like the benefits of the Space Program).

I thought today it might be interesting to talk about one such technology, namely the software used by high energy physicists.

Now each experiment at the LHC has its own unique software and computing environment (this is by design).  I can't speak for the other experiments, but researchers within the CMS Collaboration have created something called CMSSW (or the CMS Software frameWork).  This software framework uses C++ plugins in a Python-based environment to analyze all experimental data taken by the CMS detector, and all simulated data created by the CMS Collaboration.  However, to use CMSSW (and the software of the other LHC experiments) you must be a member of the collaboration.

But rather than discussing CMSSW, I would like to discuss something common to all LHC experiments (and available to the general public): ROOT.  It is this "practical application" that I'd like to bring to your attention.

(Readers less experienced with programming languages may want to see the “Coding” section of one of my older posts for some background info).

 

What is ROOT?

ROOT is an object-oriented software framework that uses a C++ interpreter to write scripts/macros for data analysis.  There are many pre-defined classes and methods available in ROOT; these are designed to enable a user to quickly & efficiently access large amounts of data and perform analysis.  ROOT has both a command line interface and a graphical user interface, so modifications can be made either "on the fly" or by re-running a script/macro.

ROOT is very powerful, and it is possible to incorporate other libraries (such as the C++ Standard Template Library & others) into ROOT scripts/programs.
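For instance, here is a tiny macro mixing an STL container with a ROOT class (a sketch; the file name is made up, and you'd run it with "root stlExample.C"):

// stlExample.C -- mixing the STL with ROOT classes
#include <vector>
#include "TH1D.h"

void stlExample() {
    std::vector<double> values = {1.0, 2.5, 2.7, 3.1, 4.0};  // STL container
    TH1D *h = new TH1D("h", "STL + ROOT", 10, 0.0, 5.0);     // ROOT histogram
    for (double v : values) h->Fill(v);                      // histogram the vector's contents
    h->Draw();
}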

But, programming jargon aside, what can you actually do with ROOT?  Simple answer: lots.

ROOT is perfect for creating graphics, namely graphs & plots of interesting data.  But it can also be used to perform more useful tasks, such as numeric integration or differentiation.  ROOT also has several linear algebra features built in (so you can do matrix multiplication/addition with it).  ROOT even enables a user to perform high-level custom curve fits.
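As a quick taste of those non-plotting features, here's a minimal sketch (the file name is made up; TF1 is ROOT's function class and TMatrixD its matrix class):

// mathExample.C -- integration, differentiation, and matrix algebra
#include <iostream>
#include "TF1.h"
#include "TMatrixD.h"

void mathExample() {
    TF1 f("f", "x*x", 0.0, 10.0);                    // f(x) = x^2 on [0, 10]
    std::cout << f.Integral(0.0, 3.0) << std::endl;  // prints 9 (integral of x^2 from 0 to 3)
    std::cout << f.Derivative(2.0) << std::endl;     // prints 4 (slope of x^2 at x = 2)

    TMatrixD a(2, 2), b(2, 2);                       // two 2x2 matrices
    a(0,0) = 1; a(0,1) = 2; a(1,0) = 3; a(1,1) = 4;
    b(0,0) = 5; b(0,1) = 6; b(1,0) = 7; b(1,1) = 8;
    TMatrixD c = a * b;                              // matrix multiplication
    c.Print();
}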

In fact, in some ways ROOT is very similar to programs like Mathematica & MATLAB.

However, ROOT has a distinct advantage over these products: it's free.  ROOT can be downloaded by anyone, and it has a rather detailed User's Guide and a set of Tutorials/HowTo's that can show new users how to perform a specific task.

But, enough boasting, let's show some examples so you can get a feel for what ROOT can do!  I'm going to show some simple commands and their outputs; if you'd like to try them out yourself, feel free.  My goal with this post is to get you interested in ROOT, not necessarily to show you how to use it (guides such as that already exist! See links above!).

 

Example: Visualization

Suppose I was interested in observing the jet topology (or how the jets appear in space) in a particular proton-proton collision event.  There are several ways I could do this.  The first of which is to make what’s called a “lego plot.”  In a lego plot, I place the jet in space based on its angular coordinates; the polar angle, θ, and the azimuthal angle, Φ; and then each point is “weighted” by its momentum component in the xy-plane (termed pT).  To see how these angles & the xy-plane are defined in CMS, see the diagram below:

 

But in high energy physics θ is not very useful; instead we use a related variable called η, the "pseudorapidity," which is defined as η = −ln[ tan(θ/2) ] (so η = 0 corresponds to θ = 90°, perpendicular to the beam axis, while the positive z-axis corresponds to η = +∞).

So in a lego plot I take all the jets in my event, and I plot them by their η & φ values.  This is very simple to do in ROOT, and for this task I'm going to make a two-dimensional histogram:

TH2D *LegoPlot = new TH2D("LegoPlot", "Jet Topology", 50, -5.0, 5.0, 50, -3.15, 3.15);

LegoPlot->Fill( Jet.eta(), Jet.phi(), Jet.pt() );

Where the first line creates an instance of a two-dimensional histogram object (the extra numbers set the number of bins and the ranges of the two axes), and the second line stores the jet's η, Φ, & pT as an (x,y) point; but let's call this an (η,φ) point instead.  This is literally all I need to type.  Of course, this is just for one jet; I could put the second line within a loop structure to enter all the jets in my event, as sketched below.
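For example, with a hypothetical jets collection (a stand-in for whatever your analysis framework actually provides), the fill loop might look like:

// "jets" and its interface are stand-ins for your framework's jet collection
for (unsigned int i = 0; i < jets.size(); ++i) {
    LegoPlot->Fill( jets[i].eta(), jets[i].phi(), jets[i].pt() );
}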

To visualize this output, I simply need to type:

LegoPlot->Draw("lego2");

Where “lego2” is an option of the Draw command.  The output of this command is then:

 

Three Jet Event in CMS

 

Here ROOT will automatically open up a new window, and draw the plot for us…it even gave us some statistics regarding the plot (upper right corner).

And all this was done with one line of code!

But, unfortunately, the plot isn't labeled, so we can't make sense of it quite yet.  We could use the graphical interface to add a label, or we can use the command line approach.  The GUI is great, but if I have to make this plot over and over again from multiple data files, I'm going to get really tired of using the GUI each time.  So instead, I could use the command line interface and write a script to have ROOT do this for me.  The commands I would use are:

LegoPlot->SetXTitle("#eta");

LegoPlot->SetYTitle("#phi (Radians)");

LegoPlot->SetZTitle("p_{T} (GeV/c)");

Then upon running my script ROOT would automatically add these titles to the plot.

The use of "#" signs in the above lines lets ROOT know that I don't just want the axis to say "eta" but that I want the axis to display the symbol "η."  The underscore with the {} brackets informs ROOT that I also want a subscript (superscripts are done with ^{…text…}).  So with a few lines of code in the ROOT framework I have not only stored data, but shown it graphically.

I never had to compile anything, and I didn’t need to spend time building my GUI!
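Putting the pieces together, the whole thing fits in a macro as short as this (a sketch; the file name and the three jets' (η, φ, pT) values are made up for illustration):

// legoPlot.C -- run with: root legoPlot.C
#include "TH2D.h"

void legoPlot() {
    TH2D *LegoPlot = new TH2D("LegoPlot", "Jet Topology",
                              50, -5.0, 5.0,     // 50 bins in eta
                              50, -3.15, 3.15);  // 50 bins in phi

    double eta[3] = { 0.5, -1.2,  2.0 };   // made-up three-jet event
    double phi[3] = { 0.3,  2.8, -2.5 };
    double pt[3]  = { 80.0, 45.0, 30.0 };  // GeV/c

    for (int i = 0; i < 3; ++i) LegoPlot->Fill(eta[i], phi[i], pt[i]);

    LegoPlot->SetXTitle("#eta");
    LegoPlot->SetYTitle("#phi (Radians)");
    LegoPlot->SetZTitle("p_{T} (GeV/c)");
    LegoPlot->Draw("lego2");
}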

The final plot result is shown here:

Three Jet Event in CMS, with Labels!

 

But this η-Φ plot really hasn't helped me visualize the jets in 3D; after all, CMS is a giant cylinder.  The above plot is what you would get if I took a pair of scissors to the cylinder (starting at the x-axis) and cut down a line parallel to the z-axis.  This would then "un-roll" the cylinder into the flat plane above.

But what if I wanted to view this plot in actual “η-Φ” space?  Well ROOT can do that too, and in one line of code!

LegoPlot->Draw("psrlego2");

The familiar "lego2" is still there, but now I've added "psr" to the options of the draw command.  ROOT understands psr to mean 3D pseudorapidity coordinates.  The output of this option is shown here:

 

Three Jet Event in CMS, in eta-phi space
Again, with a simple command I've been able to do some very intense plotting.  Of course these are just a few brief examples.  I am by no means trying to give an all-inclusive guide on how to use ROOT.  As I've mentioned, those already exist (see the user's guide, tutorials & how-to's I've linked above).

Example: Curve Fitting

I think one of the most challenging things in all of science is curve fitting.  The reason I believe it is challenging is two-fold: first, you have to know what kind of curves would do well in describing your data; second, curve-fitting software is usually very expensive!

However, as I mentioned, ROOT is free!  And it can perform very powerful curve-fitting techniques very simply.

Suppose I’ve made a histogram of an observable, and kept track of the number of counts per each value of my observable (this is my measurement).  Let’s say it looks like this:

 

Example Measurement

Now let's say I'm interested in fitting a curve to this data.  Ordinary office programs such as Open Office Spreadsheet or Microsoft Excel have the ability to do simple fits, such as polynomials or simple exponentials.  But beyond a correlation coefficient, I'm not going to get much out of a fit from one of those programs.  I also don't really get much functionality from them either.

Let me elaborate on that last point.  The above graph has a horizontal asymptote at one.  Let's say I want to incorporate this behavior into my fit.  Well, I happen to know that the function:

f(x) = 1 − exp(−x)

has this asymptotic behavior.  This is a relatively simple function, but I couldn't use the "out-of-the-box" Microsoft Excel for this fit.

But the above function is just too simplistic; it doesn't allow for any "shifts" or changes in the data from that expression.  Instead, let's toss in a few parameters, called A & B; these parameters will give us some more flexibility in our fitting procedure:

f(x) = 1 − exp(A·x + B)
This is again simplistic, but staying simple is usually a good rule of thumb in science.

Now we've settled on a function to fit to our data.  How do we implement it in ROOT?  Again, it is very simple: we use the function class already available in the ROOT framework:

TF1 *func = new TF1("func", "1.0 - exp([0]*x + [1])", 0, 40);

Here, I've set up a new function.  The first word in quotes is my function's name, "func."  The second set of quotes is the mathematical expression I want the function to use, with [0] and [1] being our parameters A & B.  Then the last two numbers are the range of the x-variable over which the function will be defined.

This should immediately illustrate the power of ROOT.  In one line, I can tell ROOT symbolically what mathematical expression I want it to use for fitting.  I can construct any function imaginable, with any number of parameters, just by typing it out to ROOT.  ROOT will even recognize trigonometric functions, along with others.  I can even construct numeric functions (but this takes more code than one line).
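For example, here is a damped sine with three parameters (a sketch; the name "damped" and the starting values are made up), including labels and starting values for the fitter:

TF1 *damped = new TF1("damped", "[0]*exp(-[1]*x)*sin([2]*x)", 0, 40);
damped->SetParNames("Amplitude", "Decay", "Frequency");  // label the parameters
damped->SetParameters(1.0, 0.1, 2.0);                    // starting values for a fit
damped->Draw();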

Now to perform the fit I just tell the histogram above (call it “Histo”) that I want to fit a function to it.  This is done by:

Histo->Fit(func, "", "", 3, 40);

The quotes in the above expression tell ROOT how to perform the fit.  Right now there’s nothing in the quotes, so ROOT will just use its default fitting method (chi-squared minimization), in the range of x equals 3 to 40.

Executing this command causes ROOT to perform the fit and spit back the values for my parameters A & B along with their errors:

 

Fit Output

Here the parameters [0] and [1] are labeled as “p0” and “p1.”  There is a column for their values (“VALUE”), and a column for their errors (“ERROR”).  Up at the top I can see that the fit converged, and that ROOT took 86 attempts/iterations in its fitting process.
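And if you're scripting, you don't have to read those numbers off the screen; after the fit they live on the function object, and you can pull them out directly (a sketch):

double A    = func->GetParameter(0);   // fitted value of A
double Aerr = func->GetParError(0);    // and its error
double B    = func->GetParameter(1);
double Berr = func->GetParError(1);
double chi2 = func->GetChisquare();    // fit quality, chi-squared
int    ndf  = func->GetNDF();          // number of degrees of freedom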

The “Histo->Fit….” command will also plot the original histogram with the fit overlaid, as shown here:

 

Result of Fit

 

ROOT has also placed the fit parameters in the statistics box.  From the χ²/ndf we see that the fit wasn't a very good fit mathematically; but we weren't really trying here either.  With a better fit function, and a more advanced fitting procedure, we can get χ²/ndf ~ 1.0 (exactly what we want to have!).

 

In Closing

My goal with this post was to illustrate a product that has come about because of High Energy Physics research, and show that it could be beneficial for the rest of society.  Hopefully this will spark your interest in ROOT for science/engineering/mathematics applications.  There is an extensive ROOT community and support system that you may turn to if you decide to learn ROOT and encounter problems/questions.

I would highly recommend ROOT for any of our readers who are students with technical majors (at all levels).

 

Until next time,

-Brian


Getting Ready

Wednesday, September 3rd, 2008

I’m usually fairly reserved about my enthusiasm, but I have to admit that now even I am getting excited about first beam.

The ATLAS pixel detector is up and running in the pit, and I’ve been working hard this week on looking at the data from calibration scans. Since I wrote a lot of the tools for looking at large quantities of pixel calibration data in a systematic way, I’m the most up-to-speed on using them; and since we have to be calibrated and ready to run very soon, there’s a lot of demand for those skills. Being useful, and having a lot to do, makes me happy. I get up early in the morning ready to come to work, and leave only reluctantly in the evening when I’m too tired to get anything done.

I’ve also been trying hard to get all the training I need to run pixel detector shifts, and it looks like my efforts have borne fruit. I have “training shifts” on Friday and Monday, and hopefully after that I’ll be able to do things on my own. The only downside is that the day shifts now start at 7 AM—it’s a good thing I’ve been getting up early ready to come to work!


Teaching an Old Dog New Tricks

Wednesday, August 6th, 2008

I've been slacking; I don't think I have written anything for more than a month. I can't blame being busy, because that is the usual excuse, and I can find time to write a short blog. But it has been, well, eventful: my car was in an accident and was totaled (no one hurt, thank goodness), we've had several shorter versions of the CRuZeT runs described in earlier posts, we've been off to Montreux for a day of jazz, and somehow I am training for a marathon in September (the next run is 20 miles long). By the end of the day, I'm shot, and honestly, uninspired. But today, I was at least inspired.

I am reaching a point now, after working on hardware for nearly 6 years straight, that I have to actually begin to look again at what comes out of it with software tools. The problem is, the tools have changed. Not necessarily for the better, but they have changed, and this dog has a few new tricks to learn.

One of these tools is an analysis and graphing package that I need to use to turn columns of numbers, for example, into a graph: something I can look at and use to make a decision, for example, on timing. It is called ROOT, which I think stands for R(?) Object Oriented Tool or something like that. I don't really know. I was raised on PAW (Physics Analysis Workstation), another analysis and graphing package, based on FORTRAN, and got my paws wet with that. I spent years working with PAW, and now I have to switch to ROOT. I am basically learning by reading web pages and such, but for a while today it completely flummoxed me. But finally, I got it, and I have to admit, some things are better, like the C++-like programming. But I still have a long way to go…

The other tool I am trying to figure out is our CMSSW (CMS Soft Ware) package. I can now get it to run for me and produce some output that might be useful, but I needed lots of hand-holding to do that. Slowly I am getting it, but I am not yet ready to change the base code. I'll leave that to the graduate students for a while longer. I'm liable to throw a monkey wrench into the works.

Now I am going to work on a talk with another tool that seems to have changed significantly in its new release, ugh. Thanks Mr. Gates.


Event Viewing

Thursday, May 22nd, 2008

Being able to visualize events in the detector is critical to understanding whether everything is functioning properly. But creating a program to display events in practice is incredibly difficult. I have the utmost respect for people who attempt it.

Obviously the big hurdle to event viewing is trying to display a three-dimensional detector on a two-dimensional screen. ATLAS has two solutions to this. One is Atlantis, the tried-and-true event viewer. The philosophy of Atlantis is to try and present the ATLAS detector in every two-dimensional slice possible, such as in this picture here.

Atlantis Event Viewer

From top left going clockwise, you see the full detector as if you were looking down the beam pipe, then the same cross section zoomed in on the calorimeters, then again the same cross section showing the inner detector, then a ‘bird’s eye’ view looking down on the beam pipe, and lastly a side profile of the detector (where the beam pipe is now the horizontal plane).

Atlantis as a tool is very useful but as for style… hmmm, not so much. It does have that retro look and while retro in fashion is considered acceptable, retro in computing is generally not.

Our second option is Visual Point 1, or VP1. VP1 takes the opposite approach, going totally 3-dimensional and allowing the user to place himself or herself at any point in the detector. In this picture, the view point is outside the calorimeter.

Atlas VP1 Viewer

The detector is just a shadow, barely seen in the picture, and only the hits are shown (in yellow here). While VP1 definitely has that more modern feel, the jury is still out for me. It kind of reminds me of Tron. And it is too touchy: you accidentally hold the mouse button down too long and you are transported to some strange viewpoint. And then you have no idea where you are, or what you are looking at.

It is a thankless job, that is for sure!


Wrestling with the Grid

Monday, May 5th, 2008

This being my first entry, I suppose I ought to start with what I do and put it into context—you’ll have to bear with me, because this will take a minute. First, the preliminaries: I’m a fourth-year graduate student at the University of California, Berkeley, working on ATLAS and currently based at CERN in Geneva, Switzerland. I work primarily on testing the offline software.

If you're a regular reader of the US/LHC Blogs, then that last paragraph made sense to you, except maybe for the last two words. Offline software is the part of the experiment you hear the least about, probably because it's the hardest to explain. It's the collection of programs and tools that connects the information that comes from the detector to the physics we're really interested in, so you have to know about both those things to see what it's all about. Fortunately, from what I can tell, if you're a regular reader here then you're in pretty good hands already as far as the physics and the detector go. So let me give an extremely brief survey of the software challenges faced by the ATLAS detector, and then connect it to my own work at the end.

The first and most daunting computing challenge faced by the ATLAS detector is the vast discrepancy between the 40,000,000 potential collisions per second and the 100 or so events that can be stored permanently during that second; this job is handled by the trigger system, which looks for the collisions that will be most interesting for the physics that we want to do. The first part of this system is entirely hardware-based, but the higher levels run on Linux farms. The data for all this processing has already passed through hardware, firmware, and low-level software, both on and off the detector. So there's a lot of software involved just in reading out the signals on the detector and storing them.

None of that is the offline software, though—that comes in after the data has already been stored on tape, and is less urgent in the sense that it will be done within days rather than within seconds. One of the major tasks of the offline software is reconstruction, which is the conversion of stored information from the detector into the real particles that most likely created those signals. (This has already been done, quickly, by the trigger, but is now done with more precision.) For example, the software might combine a series of “hits” in the Inner Detector to make the likely track of a charged particle, then combine this with energy deposited in the electromagnetic calorimeter to identify a possible electron. (Monica has written more on how information is combined to identify various particles in this entry.) Offline software is also used to simulate the physics of the detector, which is useful now so that we can “practice” our analyses for when the data is ready, and will be useful later in comparing what we actually see in the detector to what we would expect if the Standard Model were exactly right.

The ATLAS detector is going to record a lot of data, and reconstruction and other offline software tasks take a lot of computer time. Where are all these computers? Well, it turns out that no laboratory in the world has anywhere near enough computing power to do the job, so we link them all together in something called the Grid. This collection of computing sites, spread throughout the world, will have the data recorded by the experiment divided between them; when a physicist wants to look at the data, the job is sent to where the data is, which is much more efficient than copying the data to the physicist’s computer (if she even had enough space, which she probably doesn’t). Of course, using this complicated system presents new challenges; a big one is that the job you run could be sent anywhere, so it’s a lot harder to call tech support if the job fails for some reason. In order to deal with this problem, the ATLAS offline software includes job transforms, which are essentially wrappers for our regular software; whereas normally our jobs are configured by python scripts, the transforms take a very limited number of inputs. This lets us be sure that we’re running the job in a “standard” configuration that can be expected to work, so that the Grid’s computer resources can be used efficiently.

Of course, things can still go wrong, and this—at last!—is where I start to come into the picture. Although the experiment’s software developers always test their changes against the latest version of the code, there are several kinds of bugs they can’t catch, including: 1) bugs that only happen in very large jobs, 2) bugs arising because two developers have made incompatible changes at the same time, and 3) bugs that appear only when multiple stages of data processing (e.g. simulation, then reconstruction) are run. This means that we might produce a software release in which one of the “standard” job transform configurations, which should work, actually doesn’t; if we send such jobs to hundreds of machines around the world, and they all crash in parallel, that’s a big waste of time and money! One of the tools we have to guard against this is the Full Chain Test, which I have written and maintained over the last year or so. This is a set of scripts which send a series of large jobs to a few dedicated four-processor machines here at CERN, to make sure as well as possible that everything is working the way we expect before we send things off into the Grid.

So the short version is this: I write programs to run other programs and make sure they work. Or, as I often tell my friends and family, I sit in front of the computer all day.

Seriously, although this is not the most glamorous work, I’m very happy with the project, because:

  1. It’s important. It’s used routinely by the experiment’s software management to ensure that our releases are good, and it actually finds problems that save us time—which means that I’m helping make sure that our offline software is ready to run when the detector is.
  2. It’s self-contained. I have a specific set of things to be tested, but the details of the implementation have been mostly up to me, so I’ve learned quite a bit.
  3. It’s done, except for a bit of documentation, and hopefully I’ll be able to pass on the routine maintenance to someone else.

That last item is especially important, because as a student I have a lot to learn and only so much time—and a wise professor once told me that once I get good at something, that means it’s time to move on. I have a thesis project to prepare for, and I’m hoping to work more directly with the detector once the commissioning of the Pixel Detector starts in earnest. But more on those things later.
