Ken Bloom | USLHC | USA


October, exercised

Here at CMS, we are in the midst of something that, I guess for lack of a better name, has been dubbed the “October exercise.” Over the past week, and continuing through the week to come, we have been trying to get as many people as possible to use the distributed computing system just as they would if they were doing a real analysis with real data. A new set of simulations has been released, and people are trying to work them through the system and their data analyses as quickly as possible, to demonstrate the turnaround time and the scale at which we will be hammering the computing clusters that are distributed around the world.

Halfway through, I would have to consider this at least something of a success. I don’t have anything resembling an accurate count of how many people have gotten involved, but it seems that we are seeing lots of people who had just been doing their data-analysis work on local computing clusters now trying to use the grid for the first time. Tens of individual exercises have been designed by the dozen-ish CMS physics groups, each with multiple steps involving processing, writing and transferring data. As someone who has been working on distributed computing for some years now, it is encouraging to see so many new people try out the system, and be successful more often than not.

On the other hand, it’s not as if everything has gone perfectly. A number of new tools and rules were developed just in advance of the exercise, and running these things out of the box at scale has been a bit bumpy. We were certainly aware of the weaknesses in the system, but now they are on full display. One thing that has proved particularly challenging is the “staging out” of outputs made by users in their processing jobs. In CMS computing, different datasets get distributed to different computing sites, and physicists who want to run on those datasets send their jobs to those sites. But everyone has a “home” site, and the output of the jobs has to be returned to the home site. This means that the data must be transferred from a somewhat random site X to the user’s site Y, and not every site Y can handle the volume of transfers that might be coming in. We’re keeping an eye on this and thinking about how we can improve it in the future.
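To make the stage-out problem a little more concrete, here is a toy sketch in Python. It is not CMS software: the function name, the user names and the site labels are all made up for illustration. It just counts how many output transfers land on each home site when jobs run wherever the data happens to live.

```python
from collections import Counter

def inbound_transfers(jobs, home_site):
    """Count stage-out transfers arriving at each home site.

    jobs: list of (user, run_site) pairs, one per finished job.
    home_site: dict mapping each user to that user's home site.
    A job that already ran at the user's home site needs no transfer.
    """
    load = Counter()
    for user, run_site in jobs:
        dest = home_site[user]
        if run_site != dest:
            load[dest] += 1
    return load

# Hypothetical users and site names, for illustration only.
home_site = {"alice": "SITE_A", "bob": "SITE_A", "carol": "SITE_B"}
jobs = [
    ("alice", "SITE_X"),
    ("alice", "SITE_Y"),
    ("bob", "SITE_X"),
    ("bob", "SITE_A"),   # ran at home, so nothing to transfer
    ("carol", "SITE_X"),
]
load = inbound_transfers(jobs, home_site)
```

Even in this tiny example the incoming load is uneven: a home site with several active users collects transfers from many different sites at once, which is roughly the pressure described above.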

After a week of this, I’d have to say that it’s somewhat exhausting to try to keep up with all that’s going on. And we don’t even have data yet — how exhausted will I be then? But on the flip side, I’m glad that we’re learning all of these lessons now, rather than a month or two from now.
