Aidan Randle-Conde | Université Libre de Bruxelles | Belgium

Why we shouldn’t combine Higgs searches across experiments

Right now both the ATLAS and CMS experiments are working around the clock to get results ready for the upcoming International Conference on High Energy Physics (ICHEP). What happens when we have a big conference around the corner? We try to analyze as much of the data as we can, of course! With all this pressure to get as much out of the data as possible it’s tempting to move too quickly and do what we can to get a discovery, but now is not the time to rush things.

A typical Higgs-like event at CMS (CMS experiment)

In order to declare a new discovery we need to have a 5 sigma excess (see this post for an explanation of what we mean by sigma). Projecting our sensitivities from the 2011 results to the data we have accumulated so far suggests that either experiment might see something close to 5 sigma at ICHEP. In this scenario there is an option to combine Higgs searches in order to increase the sensitivity of the datasets even further. This is already what each experiment does for the different final states, and since each experiment understands its own detector and the correlations between its measurements this is the best way to get the most from the datasets.

So if neither experiment gets 5 sigma, and we would like a discovery, what can be done? The next obvious step would be to combine the results from the two experiments and count the sigma. Despite being an obvious next step, this is the worst thing we could do at the moment. The Higgs field was postulated nearly 50 years ago, the LHC was proposed about 30 years ago, the experiments have been in design and development for about 20 years, and we’ve been taking data for about 18 months. Rushing to get a result a few weeks early is an act of impatience and frustration, and we should resist the temptation to get an answer now. Providing good quality physics results is more important than getting an answer we want.
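To make “combine the results and count the sigma” concrete, here is a minimal Python sketch of the most naive combination possible: Stouffer's method for two independent, equally weighted results. It is an illustrative assumption, not the procedure the experiments use. A real ATLAS and CMS combination is built from the full likelihoods and has to account for correlated uncertainties. The function name and the 4 sigma inputs are hypothetical.

```python
import math


def naive_combined_sigma(z1, z2):
    """Stouffer-style combination of two independent significances.

    Assumes equal weights and no correlated systematic uncertainties,
    which is an idealisation that real experiments cannot rely on.
    """
    return (z1 + z2) / math.sqrt(2.0)


# Hypothetical inputs: each experiment alone sees 4 sigma.
z_combined = naive_combined_sigma(4.0, 4.0)
print(f"naive combined significance: {z_combined:.2f} sigma")
```

The sketch shows why the shortcut is tempting: two results that each fall short of 5 sigma can cross the threshold once they are added together, with no new crosscheck gained along the way.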

The status of the ATLAS exclusion with 2011 data.  Now our focus is on the remaining space. (ATLAS experiment)

The reason we have two experiments at the LHC looking for the Higgs boson is that if one experiment makes a discovery then the other experiment can confirm or refute it. This is why we have both D0 and CDF, both Belle and BaBar, both ATLAS and CMS, both UA2 and UA1 (where in the interest of fairness the order of the names is chosen at random). Usually these pairs of experiments are neck and neck on any discovery or measurement, so when one experiment sees an effect but its counterpart doesn’t then it’s likely due to a problem with the analysis. Finding such a problem does not indicate poor scientific practice or incompetence; in fact it’s part of the scientific method to catch these little hiccups. (A good example is the dijet anomaly that CDF saw last year. In an experiment as complicated as CDF it’s not surprising that something subtle would get missed. Everything that the CDF hardware and software was telling the physicists was that there was a bump in their distribution. The easiest way to see if this is wrong is to see what the D0 hardware and software tell us. It turns out they disagreed in this instance and we got the crosscheck we needed.)

If we combine measurements from two different experiments we end up losing the vital crosscheck. The best way to proceed is to wait a few more weeks until both experiments can produce a 5 sigma discovery and see if these results agree. If the results are consistent then we celebrate victory! So let’s resist the temptation to get too excited about combined results between experiments. If we wait a few more months the discovery will be all the sweeter. Then again we may get lucky at ICHEP!



  • Mike S

    One thing I am not clear on from the plots – how sharp should the expected peak be, taking into account smoothing/bucketing in the data and plotting? When do you start thinking two Higgs, rather than one?

  • Hi Mike, good question! The width of a Standard Model Higgs boson is predicted by the theory, and at 125GeV it’s about a few percent of a GeV, much finer than our resolution. We can extract information about a particle in many ways at the LHC- we can perform angular analyses to find the spin of a new object(s) and whether more than one spin structure exists, we can look at the distribution of final states to see if there is more than one state there and so on. However, to measure the mass of a particle precisely enough to show that there is only one particle there and not more we need a precision machine such as the TeV scale linear collider (either ILC or CLIC). If we have a discovery with the current 2010-2012 datasets then the rest of 2012 will be spent trying to measure the properties of the new particle to determine if it is a Standard Model Higgs boson or something else.

    To move back to the question about the width of the peak, we expect a few tens of MeV, but our resolution is limited by the machine and also by what control samples we can use in the data. The Z boson has a mass of about 90GeV, so it’s a good way to calibrate our resolution studies at high mass. Unfortunately it has a natural width of about 2GeV, so we have to “unfold” the resolution from the natural width, and that’s a tricky process. If we can do that, then we stand a good chance of understanding our resolution and we’ll have smaller uncertainties on our resolution, which will lead to finer results. However, this will not be fine enough; we will still need a TeV scale linear collider.

  • Mike S

    Thanks, so a ~10GeV “bump” is quite reasonable.

  • Yide

    First of all thank you for this great post!

    I asked this on a fellow blogger, Pauline’s page, but perhaps she is too busy to approve the question. I truly hope you can answer my questions though.

    I’m hoping you can take some time to explain some things to me as a layperson. The first is this emphasis on using events that are several deviations away from the expected value to prove the existence of the Higgs: doesn’t it bake in an assumption that the distribution of the errors or events (or is it the manner in which interactions combine?) is Gaussian? What if the distribution is heavy tailed? I’ve never seen it explained why the focus on n-sigma, for any n, is a good idea.

    The second is that if the simulations differ from the real data and it is corrected till it is reliable –

    Isn’t that sort of like begging the question? The more experiments you do the more you fit the model to the data and the more biased the model becomes, while the model is still tied to the properties you are searching for. Clearly I am missing something here. And I’m hoping you can help answer them, I’m numerate enough to be able to handle a high level of math detail so please feel free to not simplify if it takes too much energy.


  • Hi!
    I’m a little bit puzzled by this statement you make in an answer to a comment:
    “However, to measure the mass of a particle precisely enough to show that there is only one particle there and not more we need a precision machine such as the TeV scale linear collider (either ILC or CLIC).”
    If neither ATLAS nor CMS can be sure that the bump is due to just one particle, what type of discovery could be claimed?
    Thank you.

    Dr. Cinnamon

  • Hi Yide, thanks for the questions! (I would not be surprised if Pauline is on a very long flight to Australia to attend ICHEP.)

    It’s important to understand that no individual event is several deviations away from the Standard Model expectation, we only talk about sigmas for when we have many events. At the moment we have a mass spectrum that matches our Standard Model background (assuming no Higgs boson) except for one region. That region is around 125GeV and in December we saw a bump that was nearly 3 sigma away from the Higgsless Standard Model background. I wrote a post about sigmas and what they tell us which might be useful (http://www.quantumdiaries.org/2012/05/09/a-sigma-here-a-sigma-there/) Even though we talk about sigmas, we do not generally assume distributions are Gaussian, it’s just some nomenclature we got stuck with from many years ago. If you prefer you can use probabilities (or p values) to describe results, and that term is much more accurate in its description.
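As a rough illustration of how sigmas and p values relate, the Python sketch below converts a number of sigma into the equivalent one-sided Gaussian tail probability. The one-sided convention is an assumption made here for illustration (it is the usual one when looking for an excess), and the function name is hypothetical.

```python
import math


def sigma_to_p_value(n_sigma):
    """One-sided tail probability of a standard Gaussian beyond n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


for n_sigma in (1, 2, 3, 5):
    print(f"{n_sigma} sigma -> p = {sigma_to_p_value(n_sigma):.3g}")
```

Quoting the p value directly avoids the suggestion that the underlying distribution is Gaussian: the sigma is just a relabelling of the probability.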

    You’re quite right when you say that some distributions will be tailed. The probability density function for the number of particles in a given bin in a histogram is necessarily Poisson, leading to a large positive tail, and a cut-off at 0 entries. Other examples include the natural lineshape of the Higgs boson itself (which I think is a relativistic Breit-Wigner) which then gets convoluted with other functions that smear out the shape. According to the Central Limit Theorem most of these distributions can be approximated with a Gaussian when we have high statistics samples, so in some cases we can use Gaussian distributions without much bias. You’re right to be skeptical, and so are the physicists at the LHC experiments! These kinds of questions are debated in the internal review process, often at great length. Part of our error-correcting procedure comes from these discussions, and part of it is the answer to your next question. To address your first question explicitly, the shape of the mass peak for a Higgs will be biased and non-Gaussian. If we take the diphoton final state as an example, we will observe that the photons will tend to lose a little energy as they pass through the calorimeter, leading to an asymmetric peak (they can’t gain energy from anywhere, only lose it.) This effect is more pronounced for the Z peak when it decays to two electrons- there is a very long radiative tail. These differences are built into all our models, so we should be okay.
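The point about Poisson tails and the Gaussian approximation can be checked numerically. The Python sketch below compares the exact Poisson probability of seeing at least `n_obs` entries in a bin with the Gaussian approximation for the same mean. The function names and the example numbers are hypothetical, and the Gaussian version deliberately omits a continuity correction to keep it simple.

```python
import math


def poisson_tail(n_obs, mean):
    """Exact P(N >= n_obs) for a Poisson distribution with the given mean."""
    term = math.exp(-mean)  # P(N = 0)
    cdf = 0.0
    for k in range(n_obs):
        cdf += term             # accumulate P(N = k)
        term *= mean / (k + 1)  # step to P(N = k + 1)
    # Simple subtraction is fine here; it loses precision for tiny tails.
    return 1.0 - cdf


def gaussian_tail(n_obs, mean):
    """Gaussian approximation with variance equal to the mean."""
    z = (n_obs - mean) / math.sqrt(mean)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# A "3 sigma" upward fluctuation in a low and a high statistics bin.
for mean, n_obs in ((4.0, 10), (400.0, 460)):
    print(mean, n_obs, poisson_tail(n_obs, mean), gaussian_tail(n_obs, mean))
```

In the low statistics bin the exact Poisson tail is noticeably larger than the Gaussian estimate, while in the high statistics bin the two are much closer, which is the Central Limit Theorem at work.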

    For your second question, this is quite a subtle point! Our models reflect our best understanding of the physics processes, and this keeps changing as we get more data. (Sometimes our simulations are just plain wrong, but these mistakes are usually found and corrected very quickly.) Our best understanding is never perfect, so we tweak our simulations to match the data. If we’re not careful we can get ourselves caught in a circle here, because if we match the simulation to the data and then look for differences between the simulation and the data we’ll see nothing. What we do is use control samples, which are statistically independent, and far from the signal regions. For example, if we want to see what the rate of background events for Higgs decaying to two photons is like, we can look in the regions where we know there is no Higgs boson, and then normalize the simulation that way. Generally, we take a mundane, high statistics and well understood region of some space, make the simulation agree with the data there, and then extrapolate to somewhere more interesting. It’s an old method and it usually works very well. Whenever we make a change like this we have to include the systematic uncertainty in our final result, so we try to minimize the number of corrections we make. The point at which we stop making corrections is a bit arbitrary, and these kinds of discussions can take a very long time to resolve. (I remember hearing of one correction that took over a year to complete!)
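The control sample idea can be sketched in a few lines of Python. This is a deliberately simplified illustration with hypothetical function names and made-up event counts: the simulation is normalized to the data in a sideband region assumed to be free of signal, and the same scale factor is then applied to the simulated background in the signal region.

```python
def sideband_scale_factor(data_in_sideband, sim_in_sideband):
    """Ratio that makes the simulation match the data in the control region."""
    if sim_in_sideband <= 0:
        raise ValueError("simulation must predict events in the sideband")
    return data_in_sideband / sim_in_sideband


def predicted_background(sim_in_signal_region, scale_factor):
    """Extrapolate the corrected simulation into the signal region."""
    return sim_in_signal_region * scale_factor


# Made-up counts, for illustration only.
scale = sideband_scale_factor(data_in_sideband=1050, sim_in_sideband=1000)
background = predicted_background(sim_in_signal_region=200, scale_factor=scale)
print(f"scale factor = {scale:.3f}, expected background = {background:.1f}")
```

Because the scale factor is derived from events outside the signal region, comparing the corrected prediction with the data inside the signal region is not circular. The uncertainty on the scale factor would still need to be carried into the final result as a systematic uncertainty.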

    I hope that answers your questions. If not, feel free to ask again!

  • Hi Dr Cinnamon, thanks for your question! Unfortunately, the best we can ever hope for is consistency with a particular model, and not unambiguous discovery. When the talks are given and the papers are submitted the titles will probably say “Evidence for an excess” or something similar. All we can do is show that a bump exists which is inconsistent with the Higgsless Standard Model background (ie 1 in 1,000,000 chance of occurring) and that its properties such as spin and branching fractions are consistent with the Standard Model predictions. Beyond that we can’t tell for sure whether it is just one particle, or more, without a more precise machine. So the best we could hope for would be “Observation of an excess consistent with a Standard Model Higgs boson”. Once that happens the theorists will write papers about their favorite models and they will provide stringent tests to look for new particles.

    (On a side note, a very interesting example of two very similar particles that turned out to be different are the KLong and KShort. These two particles have nearly the same mass, nearly the same quark content and they both decay to similar lighter particles. One of them has a longer lifetime, the KLong, and for a long time we thought this was the only difference between them. It turns out that their masses differ by 1 part in 10^15, which is far too small to measure directly. Instead, it was inferred from precision measurements of their mixing properties. Someone will find a way to separate out multiple states in the same mass region!)

  • I can see the value of this as a PR move (minimize risk of retraction) but from a scientific standpoint it seems like the optimal thing to do.

    Obviously, if both detectors suffer from no systematic errors doing the meta-analysis is no worse than analyzing the results from ATLAS on both even and odd days.

    Presumably, the worry is that one of the detectors has a systematic error that causes it to report events that look like the Higgs. Alright, that’s a compelling argument if when the detectors came online there was only one possible energy region in which to find the Higgs. Our prior probability that one detector would malfunction in exactly the way which would produce evidence for the Higgs that agrees with that seen at the other detector.

    I’m not going to work through all the probability but heuristically consider this:

    The probability that both detectors have a systematic error causing them to falsely gather compatible (Same energy etc..) evidence for a Higgs particle should be higher than the probability that one detector works correctly and the other has systematic error that causes it to produce results consistent with the first detector. After all, no matter how much you isolate the teams building the detectors they rely on all sorts of common knowledge in the physics community and assumptions about how things will work. These could easily be incorrect and generate two detectors with the same systematic bias.

    On the other hand if one detector works right it’s nothing more than amazing luck that the second detector would have a systematic bias causing it to produce data consistent with the first detector.

    Since a false positive from a shared faulty assumption used in construction is a much more likely problem, and you are willing to release a result after both detectors seem to verify the Higgs, the extra risk of a super unlucky mishap of an independently generated bias in one detector producing consistent data with a working detector seems trivial.

  • Hi Peter, thanks for the comment! I disagree with what you say for the following reasons. You state

    The probability that both detectors have a systematic error causing them to falsely gather compatible (Same energy etc..) evidence for a Higgs particle should be higher than the probability that one detector works correctly and the other has systematic error that causes it to produce results consistent with the first detector.

    I don’t see a justification for this statement. The two detectors use very different technology and are of significantly different size. The hardware for each was custom designed and built and while they may have used some of the same suppliers or experts, the chance that a malfunction would manifest in the same way in both detectors is minimal. (Experience shows that even within a single detector the various components of the same type require different corrections and calibrations.) Incidentally, the nice thing about a mass peak is that it doesn’t correlate directly to an energy. The mass peak should be seen in events where particles have sufficient energy to create it, so the energy of the collider is to some extent irrelevant.

    To the extent that similar experiments have mistakenly “confirmed” an effect that wasn’t real it is almost always caused by some known physical effect. (A good example would be the pentaquark which was “discovered” and subsequently “undiscovered” in the early 2000s.) In this case having two experiments doesn’t help, but other sanity checks do. (For example, we can check branching fractions, look for other decays which should also exist. In the case of pentaquarks it was found that only fixed target experiments saw the effect, and not colliding beam experiments.)

    The problem with combining results is that it leads to the question “Which is the most likely value for the mass?” If one of the experiments has a bias (eg it uses a trigger which preferentially chooses events with 62GeV photons, and to balance momentum something else at 62GeV must recoil against this photon) which leads to a false peak, and we combine the result with the other experiment which sees a wide but flat excess then the “most likely” region is still where the false peak is, and in fact it can get even more prominent after the combination. Instead we can take the two results and compare them to each other and ask ourselves (even qualitatively) if they are consistent. If they are not then we need to work out why not. If so then we can gain confidence in both results.

    There are certainly arguments on both sides of this debate. You quite rightly point to the situation where there are no systematic uncertainties, but do not point out correlated uncertainties, which are usually very hard to untangle. (In fact, during previous combinations ATLAS and CMS had a “handshake” where they had to agree on things like common parameters used in their fits in order to make sure that these correlations were taken into account.) While I look forward to an eventual combination, after a confirmed discovery from both experiments, I do not see how anyone cannot wait another few weeks or months to get a combination.

  • Yide

    Thank you for taking the time to respond. I was happy to see that the question went through as I was worried I’d been the victim of an overzealous spam filter. It’s really great what all you who communicate are doing. On your own time and doing what appears to be a thankless task, considering the response rates.. but I suspect there are a lot of people quietly appreciating the effort. To be able to get a window on and even answers from top scientists – it makes the world a bit more even, a dent in ignorance. Thanks.

    Okay the questions. You’ve well addressed them both although I am still uneasy with the first. For example, the central limit theorem requires assumptions such as independence, finite variance and additivity of the random variables. But then, the blog post you linked to had numbers so small it probably would take a mistake of incredible proportions for those assumptions to matter. And often, breaking assumptions on independence works out well enough – it’s probably why we can understand the universe at all.

    I hope I don’t appear crankish, I’m not trying to pretend I’m thinking of anything original here. It’s just these subtleties are rarely addressed in popular or even elementary level texts. So the middle of the road person who is neither a mathematical layman nor with any knowledge of physics beyond 1930 is left out cold.

    I have no idea how modern experimental physics is done, it is nothing like what we learn in high school. It sounds like what happens is you use modern theories and experiments to create complex simulations. Then test the model results with experiments to look for deviations. You then use separate probabilistic models to get at the probabilities for such deviations. Do I have this right?

    It is not at all like the straightforward experiment -> data -> theory of old. More like programmers and statisticians – machine learners than physicists. From an outside perspective, given the contrast between the clean process science is expected to be and what it really is like (and with no one taking the time to explain the new changes clearly) it is not surprising that one would feel the whole thing is wishy washy and built on shaky ground. Using statistics to get at deviations from computer simulations to characterize the detection statistics of incredibly complex detectors. So much room for error – programming, assumptions, engineering – your jobs must be stressful!

    But I suppose as more tests far outside the model are tried the more confident one can become. Inductive reasoning to the next level! At the same time I hope this whole Higgs thing is not botched. The repercussions would travel far outside physics and grievously harm public sentiment for science, reduce funding for hard science start ups and arm climate and evolutionary deniers.

    As for combining both tests, in machine learning when you train a classifier using different “perspectives” of the same data, the combined classifier does better than either one. So maybe the combo will turn out ok?

    One last question as I know I’ve gone on at length. We keep building larger and larger accelerators – is there no other way to do high energy physics?

  • Bob Rehbock

    I can see the argument but think it is mathematically more than 5 sigma that the Higgs is at this 125 value if we treat the entire data set as one experiment, which it can be. It has a cross check greater than 5 sigma, provided each separate part of the experiment is close enough to the other in result that the chance that both results would be by chance is greater than 5 sigma away from the expectation.
    But I prefer to ask what the sigma is to find both Lubos and Woit agreeing that this is political?
    Treating the LHC as for that purpose we have achieved proof that it is possible for them to agree. We observed that today.

  • Bob Rehbock

    I can see the argument but think it is mathematically more than 5 sigma that the Higgs is at this 125 value.
    But I prefer to ask what the sigma is to find both Lubos and Woit agreeing on something.
    Treating the LHC as for that purpose we have achieved proof that it is possible for them to agree. We observed that today: they both say this is politics.

  • Since I haven’t worked through the repeated case with continuous probability distributions this will be necessarily vague but I’m still not convinced.

    We agree that if the two detectors have no bias then we get a more accurate probability by combining the results.

    However, presumably, the two detectors each have some prior probability distribution over potential biases they might have. Assuming, as you argue, that these are essentially independent distributions then the chance that the detectors agree in their measurements conditional on them both being biased will be equal to the chance that the detectors agree in their measurements conditional on exactly one of the detectors being biased (I’m being a bit loose here and assuming that the sequence of measurements produced by the detectors can be divided up into equivalence classes of agreeing results and being a bit informal with the prior distribution over potential biases).

    Assuming there are many possible outcomes the chance of a biased detector agreeing with the unbiased detector will be extremely small, causing even the tiny chance of correlated bias to be the dominant concern…a concern that doesn’t go away by waiting till both machines reach the same conclusion.

    Quite possibly if you work this out using the full statistical models I turn out to be wrong but I just wanted to explain my skepticism about any serious benefit from insisting that both detectors reach the threshold on their own.

    Still I agree with your recommendation. The people who really need to know can look at the data (or hear about it through the grapevine) and decide for themselves how much confidence to give the conclusion. The public is better off waiting in anticipation until the total probability of error gets even lower.

  • Well we’ll find out on Wednesday! 🙂 Getting Lubos to agree with anyone is always fun. I wrote an article about how to reduce systematic uncertainties and he accused me of hiding results in even bigger uncertainties! I don’t think he read the article at all…

    Anyway, just to be clear, here is why I don’t want a combination: “more than 5 sigma that the Higgs is at this 125 value”. Your statement shows an obvious bias that we all have at the moment: we all want more than 5 sigma at 125GeV. It’s so easy for us to get carried away with what we think the answer should be. We mustn’t forget that CMS saw a significant bump at 119GeV.

  • Hi Peter, thanks for your long reply- I love a good discussion in the comments!

    You say “Assuming, as you argue, that these are essentially independent distributions then the chance that the detectors agree in their measurements conditional on them both being biased will be equal to the chance that the detectors agree in their measurements conditional on exactly one of the detectors being biased” but I’m not sure what you mean by “measurement” here. If you mean a measurement of a real process (eg Higgs boson decaying to two photons) then I agree with your statement, but if you mean an observation of an apparent excess then I’d disagree. Since the biases in the detectors are largely independent (they’re not entirely, see below) then the probability that both experiments will see a false bump in the same place is very small.

    Now both experiments are certainly biased, there’s no way to avoid this. At the very least they can only record data that fire the trigger, so they already select a subsample of events that look “interesting”. This can lead to biases very easily in the following way: suppose we have a trigger that fires on photons that have a transverse momentum of 62GeV. The distribution of energies of photons is roughly an exponential decay (this is just thermodynamics and the partitioning of energy states) so above 62GeV the number of photons falls very quickly. Below 62GeV the trigger does not fire as often, so we end up with a sample of events rich in photons with transverse momentum of 62GeV. That’s fine so far, there is an obvious bias that we can take into account. However, to balance momentum some object must travel in the opposite direction in the transverse plane, and occasionally this object will be a photon, or will look like a photon. The invariant mass of these two objects would be 124GeV, and it would appear as a bump in the spectrum. Adding more data would just make the bump bigger, so it could look like a new particle. As long as the other experiment doesn’t have the same trigger then the probability that both experiments would see the same fake bump in the same place is very small. (There are many other sources of bias- neither detector has full solid angle coverage, so many events get lost, and the angular distributions have “holes”. We get cosmic ray muons coming in from vertically above that fire triggers. We usually have similar thresholds for energy depositions because neither experiment wants to lose sensitivity where the other keeps it.)
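The 124GeV figure in the trigger example follows from the standard invariant mass formula for two massless particles, m² = 2 pT1 pT2 (cosh Δη − cos Δφ). The Python sketch below evaluates it for two photons that are back to back in the transverse plane. The function name is hypothetical, and placing both photons at the same pseudorapidity is a simplifying assumption made here for illustration.

```python
import math


def diphoton_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two massless particles from (pT, eta, phi), in GeV."""
    delta_eta = eta1 - eta2
    delta_phi = phi1 - phi2
    return math.sqrt(2.0 * pt1 * pt2 * (math.cosh(delta_eta) - math.cos(delta_phi)))


# Two 62 GeV photons, back to back in phi, at the same pseudorapidity.
mass = diphoton_mass(62.0, 0.0, 0.0, 62.0, 0.0, math.pi)
print(f"invariant mass = {mass:.1f} GeV")
```

With a nonzero separation in pseudorapidity the mass comes out higher than 124GeV, so a trigger threshold of this kind sculpts the low edge of the spectrum rather than producing a single sharp value.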

    Now, moving on to correlated biases, this is a big problem, and as you say, one that cannot be solved by waiting for more data to come in. This should include experimenter bias to be complete (in my mind the best reason to be in the blind with respect to the other experiment’s results) which is an area where nobody is a fair judge of their own prejudices. The Particle Data Group keeps a collection of historical plots that show how different measurements varied over time, and this demonstrates the point beautifully. People “knew” what their answers should be, and they confirmed their own personal biases in their measurements.

    For correlated hardware biases there may be no sure way to completely eliminate such effects. I referred to the pentaquarks in a previous comment, and I think that this is still the best example. Experiment after experiment “confirmed” excesses of 6 or 7 sigma, and it was only when we had a very different experimental paradigm that it became clear that such a particle did not exist. For something like the Higgs, I suppose the only way we can rule out correlated hardware effects is by observing the decay channels. The Standard Model gives stringent estimates of all the branching ratios, so if we have a bump, but we don’t see the other decay modes then we can be confident that we have not found the Higgs after all. In that scenario we may have found something, or we may have a correlated hardware effect.

    So I think that we can agree that not combining (for the sole purpose of getting 5 sigma) is better than combining right now, but it’s still far from perfect. The methods we have employed are still fallible and we need more careful checks before we declare victory.

  • Hey Yide, thanks for the comment and questions! I’m happy to answer questions from readers- I love the chance to talk to an educated audience, and it makes the job feel a lot more worthwhile. 🙂 Just to put your mind at ease, we screen the comments on here to remove spam and multiply posted comments. I review every comment (even the nasty ones!) and never censor a valid comment. But it takes a while for one of us to notice the comments and approve them.

    To respond to your points in an order that makes sense… From my own experience the treatment of statistics is not as complete as it should be. When I was at high school I optionally took a course in advanced statistics, which went into great detail about topics such as hypothesis testing, Type I and II errors, the central limit theorem, bias, and uncertainty on the variance. These are all essential topics for a practicing scientist, and since I moved into the practical world of physics these tools have been a great help to me. I’m often surprised by how few peers know about these kinds of concepts. For example, many physicists consider the standard deviation to be the “error” of a measurement, and do not realize that in some cases we need to also calculate the uncertainty on the standard deviation. It is, after all, just the result of a moment generating function like the mean, and it has its own calculable uncertainties. Then moving into the field taught me a great deal about how to apply all these tools, including the difference between frequentist and Bayesian interpretations (which is still a contentious and often misunderstood topic.) So I empathize with you that the subtleties of statistical analysis are often brushed under the carpet. Even though I like to think that I know more about statistical analysis than the average physicist, I still have trouble keeping up with the latest developments, and the seminars and classes are not frequent enough. The timelines are often aggressive and the field is competitive. So if you’re crankish, then so am I! We should be crankish.
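As an illustration of the uncertainty on the standard deviation, the Python sketch below uses the common approximation s / sqrt(2(n − 1)) for the uncertainty on a sample standard deviation s computed from n measurements. That formula assumes the measurements are roughly Gaussian, which is an assumption made here and not something stated above. The function name and the sample values are hypothetical.

```python
import math
import statistics


def stdev_with_uncertainty(samples):
    """Sample standard deviation and its approximate uncertainty.

    The uncertainty uses s / sqrt(2 (n - 1)), which holds for
    approximately Gaussian data and becomes more reliable as n grows.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    s = statistics.stdev(samples)
    return s, s / math.sqrt(2.0 * (n - 1))


spread, spread_error = stdev_with_uncertainty([1.0, 2.0, 3.0, 4.0, 5.0])
print(f"standard deviation = {spread:.3f} +/- {spread_error:.3f}")
```

With only a handful of measurements the uncertainty on the spread is a large fraction of the spread itself, which is why quoting the standard deviation as "the error" can be misleading for small samples.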

    I have no idea how modern experimental physics is done, it is nothing like what we learn in high school. It sounds like what happens is you use modern theories and experiments to create complex simulations. Then test the model results with experiments to look for deviations. You then use separate probabilistic models to get at the probabilities for such deviations. Do I have this right?

    Yes, that’s about right. There’s one more essential step: we need to apply corrections and calibrations to the simulation to get things right. We usually try to be as conservative and model independent as possible, so we start with a well defined null hypothesis (“the Standard Model with the previously discovered particles is sufficient to describe all observed phenomena”) and then make a measurement with the existing data in an attempt to falsify this hypothesis. The final step is then the interpretation in terms of a particular model, so if we do it properly we can just swap out the final step for another one and look for some other phenomenon for free, which is quite cool!
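
    As a rough illustration of testing a background-only null hypothesis, here is a minimal Python sketch of a single-bin counting experiment. It assumes a perfectly known expected background, which real analyses do not have (they fit likelihoods with systematic uncertainties); the function names and the event counts are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_tail(n_obs, b):
    """P(N >= n_obs) for N ~ Poisson(b), summed directly over the upper tail."""
    total = 0.0
    k = n_obs
    while True:
        term = math.exp(k * math.log(b) - b - math.lgamma(k + 1))
        total += term
        # Past the mean the terms only shrink, so stop once they stop mattering.
        if k > b and term <= 1e-18 * total:
            break
        k += 1
    return min(total, 1.0)  # guard against rounding slightly above 1


def significance(p_value):
    """Convert a one-sided p-value into a Gaussian significance ("number of sigma")."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p-value must lie strictly between 0 and 1")
    return -NormalDist().inv_cdf(p_value)


if __name__ == "__main__":
    expected_background = 100.0  # hypothetical expected count under the null hypothesis
    observed = 130               # hypothetical observed count
    p = poisson_tail(observed, expected_background)
    print(f"p-value = {p:.3g}, significance = {significance(p):.2f} sigma")
```

    The null hypothesis is only rejected at discovery level when the significance reaches 5, which corresponds to a one-sided p-value of about 2.9e-7.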

    In fact, you get it spot on when you say that we’re programmers and statisticians (and in some cases machine learners). We collect the data, which is an automated process because humans are not fast enough to record it, and then write the data to disk. That means all the answers we are looking for are in the dataset and they could have been there for months. It’s just a matter of sifting through the events and making the right corrections to the simulation as we go. It’s a long and tedious process, and since the dataset is so large, we have to automate our analysis using computers, so we are programmers. To an extent the complex statistical analyses we perform are an extension of this. We could pass on a graph to a statistician and ask for an interpretation, but instead we developed a whole suite of mathematical tools and perform thousands of test jobs to get the answers we need. (You can see our main analysis framework, which is open source, free and available to the world, at http://root.cern.ch) One of my friends, a neuroscientist, once asked me what my experimental paradigm was. I’d never heard of the term, so I just shrugged and told him I didn’t know. I didn’t understand what he was talking about and he didn’t understand why I didn’t understand. It was a strange conversation.

    The job is stressful and we are prone to ideas, but we have ways of dealing with it. (Sometimes the stress is unavoidable, especially this week!) Most of the mistakes come up very often, so an experienced physicist knows what to suggest and how to correct the mistakes. Simple bugs in code are a fact of life, and they nearly always get found eventually. More demanding mistakes are actually more fun to correct, because we get to use the data to do it, and it takes some thought to solve the problem. Two physicists go to the blackboard and talk about their ideas, and usually they walk away with a method to find their answer. (For example, we often try to cancel out uncertainties by finding events that have the same topology. If we look for a Higgs decaying to two taus we can calibrate our simulation using the Z boson decay to two taus.) There are many more interesting examples we can use for other scenarios, and since this is “real” physics, it’s more fun!

    You can rest assured that by the end of 2012 we’ll know whether or not we have the Standard Model Higgs boson. We’re so close to 5 sigma right now that most people would place money on a discovery in the next few months. The remaining luminosity can then be used to study the branching fractions of the bump, and if we see b quarks, tau leptons, W and Z bosons, and photons in the final state then the discovery will not be botched. There are so many crosschecks that we can be confident that a discovery represents the actual Higgs boson, and not some fake particle.

    The combination will be fine after a discovery, and in any case someone will combine the results. My objection is to combining the results in order to get a discovery and push our results past 5 sigma. It’s very difficult to perform a combination without having a personal prejudice about what the answer “should” be and where the distribution “should” peak. Seeing things from the inside of an experiment is different to seeing them from the outside, and I see CMS’s results as our crosscheck. If we combine our results and then look, we’ll lose that essential check, and so will our colleagues at CMS. Having CMS’s result in a box that we can’t access is great, because we can forge ahead with our analyses and always have a blinded check ready at the end. (In fact, this is similar to how physicists measure the quantity g-2 for the muon. One team measures one quantity, and another measures another quantity, then at the very end they divide one by the other to get the measurement. They can both work unblind, as long as they don’t show their measurements to each other until the very end.) From the outside this isn’t so much of an issue, and the temptation to combine is higher. After all, data are data, and you point out that we can increase our sensitivity by combining. Let’s not put our eggs in one basket just yet. Let’s wait a few more weeks!
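
    To see why the temptation to combine is so strong, here is a minimal Python sketch of a naive combination of independent significances (Stouffer’s method). This is not how the experiments would actually combine their searches: a real combination builds a joint likelihood and treats correlated systematic uncertainties, which is exactly where prejudice can creep in. The function name and the input values are hypothetical.

```python
import math


def combine_significances(z_values, weights=None):
    """Naive Stouffer combination of independent Gaussian significances.

    Assumes the inputs are statistically independent and ignores any
    correlated systematic uncertainties between the measurements.
    """
    if not z_values:
        raise ValueError("need at least one significance")
    if weights is None:
        weights = [1.0] * len(z_values)  # equal weights by default
    if len(weights) != len(z_values):
        raise ValueError("weights and z_values must have the same length")
    numerator = sum(w * z for w, z in zip(weights, z_values))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical inputs: two experiments each seeing a 4 sigma excess.
    print(f"combined: {combine_significances([4.0, 4.0]):.2f} sigma")
```

    Two independent 4 sigma excesses combine to about 5.7 sigma under these assumptions, so neither experiment alone has a discovery but the naive combination does.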

    To answer your final question, the short answer is “No”. Higher energies mean larger machines, because accelerating particles to higher energies takes either more length or a larger ring. (Either we use a linear collider, where energy scales roughly with length, or a circular collider, where the energy loss scales roughly with the inverse of the radius of curvature.) We can only get higher energies with smaller machines if we have new technology. There is a plot that shows accelerator energy vs time (the Livingston plot), and it’s possible to spot new technologies as they develop by looking at the curves on the plot. Finally, we can turn to cosmic rays for some experiments. It’s not easy or pretty, but they are still the highest energy particles we have access to, and that may be where we have to turn in the future. We can also look indirectly at lower energies. The mass of the Higgs boson is constrained by electroweak fits, and the charged Higgs boson has been excluded from certain regions by indirect searches at BaBar and Belle. There are ways to get hints about high energy physics from lower energies, but they’re generally harder and less “impressive” than the real searches, so we need bigger and more powerful machines. Although for the record, the LHC can increase its energy without increasing its size. It’s not working at design energy at the moment!
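
    The scaling for circular machines can be sketched in a few lines of Python. This uses the standard synchrotron radiation formula for the energy lost per turn, U0 = C_gamma * E^4 / rho, with C_gamma ≈ 8.846e-5 m/GeV^3 for electrons and a (m_e/m)^4 rescaling for heavier particles. The function name and the ring parameters in the example are illustrative assumptions, not the specification of any real machine.

```python
ELECTRON_MASS_GEV = 0.000510999
PROTON_MASS_GEV = 0.938272
C_GAMMA_ELECTRON = 8.846e-5  # m / GeV^3, radiation constant for electrons


def energy_loss_per_turn_gev(energy_gev, bending_radius_m, mass_gev=ELECTRON_MASS_GEV):
    """Synchrotron radiation energy loss per turn, in GeV.

    Scales as E^4 / rho, and as 1 / m^4 with the mass of the circulating particle.
    """
    if energy_gev <= 0 or bending_radius_m <= 0 or mass_gev <= 0:
        raise ValueError("energy, radius and mass must be positive")
    mass_factor = (ELECTRON_MASS_GEV / mass_gev) ** 4
    return C_GAMMA_ELECTRON * mass_factor * energy_gev ** 4 / bending_radius_m


if __name__ == "__main__":
    # Hypothetical ring: 100 GeV beam, 3000 m bending radius.
    for name, mass in (("electron", ELECTRON_MASS_GEV), ("proton", PROTON_MASS_GEV)):
        loss = energy_loss_per_turn_gev(100.0, 3000.0, mass)
        print(f"{name}: {loss:.3g} GeV lost per turn")
```

    Doubling the radius halves the loss per turn, while doubling the energy multiplies it by 16, which is why electron rings in particular have to grow so quickly with energy.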

  • Thank you for your great and well-informed post. From my side, let’s take a glance from a strikingly different vision or paradigm, not to negate but to complement:

    Special message on Higgs Boson from Universe with three fabric of Nature Knowledge, Matter and Energy

    • Derived from our Nature Knowledge Theory (NKT), supported by a Universe with three fabrics (Nature Knowledge, Matter, Energy), which generated the Human System Biology-based Knowledge Management (HSBKM) model framework as a “High End of Universe evolution model” and was reverse engineered by the Inverted Paradigm Method (IPM), we come up with Higgs Boson metrics with a Knowledge Value (KV) measuring 10⁻³⁸ (Planck Number) and a “k” constant close to or practically zero, indicating that the Higgs Boson lies beyond the domain of the E = mc² ecosystem in which it could be physically as well as statistically proved by LHC – CERN. In other words, it seems impossible for the scientific effort (LHC – CERN) to succeed, although the hypothesized Higgs Boson is positively believed to exist after being thoroughly studied with the Nature Knowledge Theory (NKT) approach.

    (…to read the full text, go to http://mobeeknowledge.ning.com/forum/topics/special-message-on-higgs-boson-from-universe-with-three-fabric-of )

  • Yide

    Thank you very much, you have answered my questions more thoroughly than I could have hoped. I now have a much better idea of how experimental high energy particle physics works. I suppose my discomfort is less about the uncertainties and more about whether science has gotten too hard!

    As for people putting money on the Higgs result: yep, check here =D http://www.intrade.com/v4/markets/contract/?contractId=700242

    What you mention about the dangers of the combination is certainly very worrying: another error vector via hidden biases. And a reversal on such a hyped piece of news would be tragic. The news on the Higgs itself is win-win IMO. Either celebrate new physics or celebrate good physics. There seems to be an undertone of everyone already knowing the combined data set has yielded statistically significant results, though.

    Anyway, thank you, and please don’t let me take any more of your time. You have a historic event in which to participate =)

  • Keep this going please, great job!

  • I needed to thank you for this fantastic read!
    I absolutely enjoyed every bit
    of it. I have got you bookmarked to check out new things you