Posts Tagged ‘pileup’

Physicists did a lot of planning for data analysis before the LHC ever ran, and we’ve put together a huge number of analyses since it started. We’ve already looked for most of the things we’ll ever look for. Of course, many of the things we’ve looked for haven’t shown up yet; in fact, in many cases, including the Higgs, we didn’t expect them to show up yet! We’ll have to repeat the analysis on more data. But that’s got to be easier than it was to collect and analyze the data the first time, right? Well, not necessarily. We always hope it will be easier the second or third time around, but the truth is that updating an analysis is a lot more complicated than just putting more numbers into a spreadsheet.

For starters, every new batch of data was collected under different conditions. For example, going from 2011 to 2012, the LHC beam energy will be increasing. The number of collisions per crossing will be larger as well, and that means the triggers we use to collect our data are changing too. All our calculations of what the pileup on top of each interesting collision looks like will change. Some of our detectors might work better as we fix glitches, or they might work worse as they are damaged in the course of running. All these details affect the calculations for the analysis and the optimal way to put the data together.

But even if we were running under completely stable conditions, there are other reasons an analysis has to be updated as you collect more data. When you have more events to look at, you might be interested in limiting the events you look at to those you understand best. (In other words, if an analysis was previously limited by statistical uncertainties, as those shrink, you want to get rid of your largest systematic uncertainties.) To get all the power out of the new data you’ve got, you might have to study new classes of events, or get a better understanding of questions where your understanding was previously just “good enough.”
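To make the statistical-versus-systematic point concrete, here is a minimal sketch. It assumes a simple counting analysis in which the relative statistical uncertainty falls as 1/sqrt(N) and a fixed relative systematic uncertainty is added in quadrature. The 5% systematic and the event counts are hypothetical numbers chosen for illustration, not values from any real analysis.

```python
import math

def total_relative_uncertainty(n_events, systematic):
    """Combine a Poisson statistical term (1/sqrt(N)) with a fixed
    relative systematic term, added in quadrature."""
    statistical = 1.0 / math.sqrt(n_events)
    return math.sqrt(statistical ** 2 + systematic ** 2)

# Hypothetical inputs: a 5% systematic uncertainty and growing event counts.
for n_events in (100, 400, 10_000):
    total = total_relative_uncertainty(n_events, systematic=0.05)
    print(n_events, round(total, 4))
```

With few events the statistical term dominates; with many events the total flattens out near the systematic term, which is why more data pushes you to attack the systematics.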

So analyzing LHC data is really an iterative process. Collecting more data always presents new challenges and new opportunities that require understanding things better than before. No analysis is ever the same twice.


Can the LHC Run Too Well?

Friday, February 3rd, 2012

For CMS data analysis, winter is a time of multitasking. On the one hand, we are rushing to finish our analyses for the winter conferences in February and March, or to finalize the papers on analyses we presented in December. On the other, we are working to prepare to take data in 2012. Although the final decisions about the LHC running conditions for 2012 haven’t been made yet, we have to be prepared both for an increase in beam energy and an increase in luminosity. For example, the energy might go to 8 TeV center-of-mass, up from last year’s 7. That will make all our events a little more exciting. But it’s the luminosity that determines how many events we get, and thus how much physics we can do in a year. For example, if the Higgs boson exists, the number of Higgs-like events we’ll see will go up, and so will the statistical power with which we can claim to have observed it. If the hints we saw at 125 GeV in December are right, our ability to be sure of its existence this year depends on collecting several times more events in 2012 than we got in 2011.

We’d get many more events over 2012 if the LHC simply kept running the way it already was at the end of the year. That’s because for most of the year, the luminosity was increasing over and over as the LHC folks added more proton bunches and focused them better. But we expect that the LHC will do better, starting close to last year’s peak, and then pushing to ever-higher luminosities. The worst case we are preparing for is perhaps twice as much luminosity as we had at the end of last year.

But wait, why did I say “worst-case”?

Well, actually, it will give us the most interesting events we can get and the best shot at officially finding the Higgs this year. But increased luminosity also gives more events in every bunch crossing, most of which are boring, and most of which get in the way. This makes it a real challenge to prepare for 2012 if you’re working on the trigger, because we have to sift quickly through events with more and more extra stuff (called “pileup”). As it happens, that’s exactly what I’m working on.

Let me explain a bit more of the challenge. One of the triggers I’m becoming responsible for is trying to find collisions containing a Higgs decaying to a bottom quark and anti-bottom quark and a W boson decaying to an electron and neutrino. If we just look for an electron (the easiest thing to trigger on), then we get too many events. The easy choice is to ask only for higher-energy electrons, but beyond a certain point we start missing the events we’re looking for! So instead, we ask for the other things in the event: the two jets from the Higgs, and the missing energy from the invisible neutrino. But now, with more and more extra collisions, we have random jets added in, and random fluctuations that contribute to the missing energy. We are more and more likely to get the extra jets and missing energy we ask for even though there isn’t much missing energy or a “Higgs-like” pair of jets in the core event! As a result, the event rate for the trigger we want can become too high.

How do we deal with this? Well, there are a few choices:

1. Increase the amount of momentum required for the electron (again!)
2. Increase the amount of missing energy required
3. Increase the minimum energy of the jets being required
4. Get smarter about how you count jets, by trying to be sure that they come from the main collision rather than one of the extras
5. Check specifically if the jets come from bottom quarks
6. Find some way to allocate more bandwidth to the trigger

There’s a cost for every option. Increasing energies means we lose some events we might have wanted to collect — which means that even though the LHC has produced more Higgs bosons, it’s counterbalanced by us seeing fewer of the ones that were there. Being “smarter” about the jets means more time spent by our trigger processing software on this trigger, when it has lots of other things to look at. Asking for bottom quarks not only takes more processing, it also means the trigger can’t be shared with as many other analyses. And allocating more bandwidth means we’d have to delay processing or cut elsewhere.
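The trade-offs above can be sketched with a toy trigger decision. Everything here is hypothetical: the `Jet` and `Event` classes, the `passes_trigger` function, and all threshold values are made up for illustration and do not reflect the real CMS trigger software or its settings. The sketch only shows how raising thresholds (options 1 to 3) or counting only jets from the main collision (option 4) changes which events are kept.

```python
from dataclasses import dataclass, field

@dataclass
class Jet:
    pt: float            # transverse momentum in GeV
    from_primary: bool   # True if the jet is matched to the main collision

@dataclass
class Event:
    electron_pt: float   # GeV
    missing_et: float    # GeV
    jets: list = field(default_factory=list)

def passes_trigger(event, min_electron_pt=25.0, min_met=20.0,
                   min_jet_pt=20.0, primary_only=False):
    """Toy decision: one electron, missing energy, and at least two jets.
    All default thresholds are hypothetical."""
    good_jets = [
        jet for jet in event.jets
        if jet.pt >= min_jet_pt and (jet.from_primary or not primary_only)
    ]
    return (event.electron_pt >= min_electron_pt
            and event.missing_et >= min_met
            and len(good_jets) >= 2)

# One real jet plus one jet contributed by an extra (pileup) collision.
event = Event(electron_pt=30.0, missing_et=25.0,
              jets=[Jet(40.0, True), Jet(22.0, False)])
print(passes_trigger(event))                     # pileup jet is counted
print(passes_trigger(event, primary_only=True))  # pileup jet is ignored
print(passes_trigger(event, min_jet_pt=30.0))    # tighter jet threshold
```

In this toy event the pileup jet is what pushes the event over the line, so either the "smarter" jet counting or a higher jet threshold rejects it; the higher threshold would also reject genuine events with soft jets.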

And for all the options, there’s simply more work. But we have to deal with the potential for extra collisions as well as we can. In the end, the LHC collecting much more data is really the best-case scenario.


Lost in Acronym Translation

Thursday, October 13th, 2011

My first impression, once I got myself properly into the CMS databases and joined the requisite forty or so mailing lists, was that CMS has a lot more acronyms than I was used to. Particularly jarring were the mysterious PVT (“Physics Validation Team”) meetings. And the many occurrences of “PU” (“pileup”) always looked to me like “Princeton University” until I realized that made no sense in context.

But then I remembered all the acronyms on ATLAS, and learned that “PU” has gotten more common there too now that the increasing pileup is a frequent subject of discussion. (I really wasn’t paying much attention to either ATLAS or CMS during the year when I did my analysis and wrote my thesis.) So although the culture of acronym use may be a bit different, it’s really just a matter of translating from one experiment’s terms to another.

For example, I recently learned that a JSON (“JavaScript something something”) file indicates which LumiSections (not an acronym, oddly) are good in a set of runs — in other words, for which times are the recorded data for all parts of CMS in good shape? On ATLAS, it would have been a GRL (“good run list”) indicating which LumiBlocks were good.
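As a rough sketch of how such a file might be used, assume it maps each run number (as a string) to a list of inclusive [first, last] LumiSection ranges. That layout, the `is_good` helper, and the run numbers below are assumptions made for illustration.

```python
import json

def is_good(good_lumis, run, lumi_section):
    """Return True if lumi_section falls in any good range listed for run.
    Assumes good_lumis maps run-number strings to [first, last] ranges."""
    ranges = good_lumis.get(str(run), [])
    return any(first <= lumi_section <= last for first, last in ranges)

# Hypothetical file contents, parsed from a JSON string.
good_lumis = json.loads(
    '{"160431": [[19, 218]], "160577": [[254, 306], [310, 320]]}'
)

print(is_good(good_lumis, 160577, 300))  # inside the first range
print(is_good(good_lumis, 160577, 308))  # in the gap between ranges
print(is_good(good_lumis, 999999, 1))    # run not listed at all
```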

I still think that acronyms are thrown around in conversation a bit more on CMS than on ATLAS. Fortunately, there is a public list of CMS acronyms to help me. I’m sure I’ll figure them out eventually.


Imagine you’re in charge of a budget for a large organization of a few thousand people who are experts in their field.  Imagine that if you don’t spend some of the money in the budget, you can’t keep what you’ve saved: it will be lost forever.  Now imagine that there’s another group of a few thousand experts with exactly the same budget, right down to the last penny.

That’s the kind of scenario that we face at the LHC, except the budget is in time and not money.  We count proton collisions and not dollars.  The LHC is delivering world record luminosities right now, and the different experiments are getting as much data as they can.  For LHCb and ALICE there is pressure to perform, but between ATLAS and CMS the competition is cutthroat.  They’re literally looking at the same protons and racing for the same discoveries.  Any slight advantage one side can get in terms of data is crucial.

What does any of this have to do with my work at ATLAS?  Well, I’m one of the trigger rates experts for pileup.  When we take data we can’t record every proton collision; there are simply too many.  Instead, we pick the interesting events out and save those.  To find the interesting events we use the trigger, and we only record events when the trigger fires.  Even when we exclude most of the uninteresting events we still have more data than we can handle!  To get around this problem we have prescales, which is where we only keep a certain fraction of events.  The trigger is composed of a range of trigger lines, which can be independent of one another, and each trigger line has its own prescale.
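A prescale can be pictured as a counter on each trigger line. The sketch below is a simplified illustration rather than the ATLAS implementation: the `PrescaledLine` class and the line name are hypothetical, and it assumes a deterministic "keep every Nth firing" scheme, whereas a real system could equally select events at random.

```python
class PrescaledLine:
    """Toy trigger line that keeps one out of every `prescale` firings."""

    def __init__(self, name, prescale):
        if prescale < 1:
            raise ValueError("prescale must be at least 1")
        self.name = name
        self.prescale = prescale
        self._fired = 0  # number of times the line has fired so far

    def accept(self, fired):
        """Return True if this event should be recorded by this line."""
        if not fired:
            return False
        self._fired += 1
        return self._fired % self.prescale == 0

# Hypothetical line with a prescale of 10: 100 firings, 10 events kept.
line = PrescaledLine("hypothetical_electron_line", prescale=10)
kept = sum(line.accept(True) for _ in range(100))
print(kept)
```

Because each line carries its own counter, the prescale on one line can be changed without affecting any other line.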


High pileup scenarios. Can you count the vertices? (ATLAS Collaboration)

The term “pileup” refers to the number of proton collisions per bunch crossing (roughly, how many interactions we can expect to see when we record an event).  When I came to ATLAS from BaBar I had to get used to a whole new environment and terminology.  The huge lists of trigger lines alone made my head spin, and so far pileup has been the strangest concept I’ve had to deal with.  Why take a scenario that is already overwhelmingly complicated, with one of the most intricate machines in the world, and make it even harder to understand, for the sake of a few more events?  Because we’re in competition with CMS, that’s why, and everything counts.  The image on the right shows a typical event with multiple interactions.  Even counting the number of vertices is difficult!
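To give a feel for "how many interactions we can expect", here is a small sketch that assumes the number of collisions in a bunch crossing follows a Poisson distribution around some mean.  The Poisson model is a common simplifying assumption, and the mean of 10 is a hypothetical value chosen for illustration, not a measured one.

```python
import math

def pileup_probability(n_interactions, mean):
    """Poisson probability of exactly n_interactions collisions in one
    bunch crossing, given the mean number of collisions per crossing."""
    return math.exp(-mean) * mean ** n_interactions / math.factorial(n_interactions)

mean_pileup = 10.0  # hypothetical mean number of collisions per crossing

# Chance that a crossing contains at most one collision.
p_clean = pileup_probability(0, mean_pileup) + pileup_probability(1, mean_pileup)
print(p_clean)

# Chance that a crossing contains 15 or more collisions.
p_busy = 1.0 - sum(pileup_probability(k, mean_pileup) for k in range(15))
print(p_busy)
```

Under this assumption, crossings with a single clean collision are rare at a mean of 10, and a noticeable fraction of crossings are much busier than average.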

Balancing the different prescales is where things get interesting, because we have to decide how we’re going to prescale each trigger.  We have to make sure that we take as much data as possible, but also that we don’t over-burden our data taking system.  It’s a fine balancing act and it’s hard to predict.  Our choice of trigger prescales is informed by what the physicists want from the dataset, and what range of types of events will maximize our output.  Exactly what kinds of events we want is a very hotly debated topic and one that is best left to a separate blog post!  For now, we’ll assume that the physicists can come up with a set of prescales that match the demands of their desired dataset.  What usually happens then is that the trigger menu experts ask what would happen if things were a little different, if we increased or decreased a certain prescale.


The effects of proton burning on luminosity. (LHC)

We need to pick the right times to change the prescales, and it turns out that as we keep taking data, the luminosity decreases because we lose protons when they interact.  This is known as proton burning and you can see the small but noticeable effect of this in the image above.  As we burn more protons we can change the prescales to keep the rate of data-taking high, and that’s where my work comes in.  The rates for different trigger lines depend on pileup in different ways, so understanding how they act in different scenarios allows us to change the prescales in just the right way.  We can make our trigger very versatile, picking up the slack by changing prescales on interesting trigger lines, and pushing our systems to the limit.  My job is to investigate the best way to make these predictions, and use the latest data to do this.  The pileup scenarios change quite rapidly, so keeping up to date is a full time job!  And every second spent working on this means more protons have been burned and more collisions have taken place.
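The idea of loosening a prescale as the luminosity falls can be sketched as follows.  All of the modelling here is assumed for illustration: the exponential luminosity decay, the rate model with a linear term plus a quadratic term standing in for pileup-driven growth, the function names, and every number.  None of it is taken from the real ATLAS rate predictions.

```python
import math

def luminosity_at(hours, initial=1.0, lifetime_hours=10.0):
    """Assumed exponential decay of luminosity (arbitrary units) during a fill."""
    return initial * math.exp(-hours / lifetime_hours)

def raw_rate(luminosity, linear, quadratic):
    """Hypothetical unprescaled rate in Hz for one trigger line; the
    quadratic term mimics a rate that grows faster than luminosity."""
    return linear * luminosity + quadratic * luminosity ** 2

def smallest_prescale(rate, budget):
    """Smallest integer prescale that keeps rate / prescale within budget."""
    return max(1, math.ceil(rate / budget))

# As protons are burned, the same rate budget allows a smaller prescale.
for hours in (0, 4, 8, 12):
    lumi = luminosity_at(hours)
    rate = raw_rate(lumi, linear=600.0, quadratic=400.0)
    print(hours, round(lumi, 2), smallest_prescale(rate, budget=100.0))
```

A line whose rate depends more strongly on pileup (a larger quadratic term) drops off faster as the fill goes on, which is why each line's pileup dependence has to be understood separately before the prescales are changed.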

It’s not an easy task: it forces me to think about things I’ve never considered before, and keeps the competition at the forefront of my mind.  I knew I’d be in a race for discovery when I joined ATLAS, but I never realized just how intense it would be.  It’s exciting and a little nerve-wracking.  I don’t want to think about how many protons pass by in the time it takes to write a blog post.  Did we record enough of them?  Probably.  Can we do better?  Almost certainly.  There’s always more space in this budget, and always pressure to stretch it that little bit further.