Posts Tagged ‘computing’

The art of data mining is about searching for the extraordinary within a vast ocean of regularity. This can be a painful process in any field, but especially in particle physics, where the amount of data can be enormous, and ‘extraordinary’ means a new understanding about the fundamental underpinnings of our universe. Now, a tool first conceived in 2005 to manage data from the world’s largest particle accelerator may soon push the boundaries of other disciplines. When repurposed, it could bring the immense power of data mining to a variety of fields, effectively cracking open the possibility for more discoveries to be pulled up from ever-increasing mountains of scientific data.

Advanced data management tools offer scientists a way to cut through the noise by analyzing information across a vast network. The result is a searchable pool that software can sift through and use for a specific purpose. One such hunt was for the Higgs boson, the last remaining elementary particle of the Standard Model that, in theory, endows other particles with mass.

With the help of a system called PanDA, or Production and Distributed Analysis, researchers at CERN’s Large Hadron Collider (LHC) in Geneva, Switzerland discovered such a particle by slamming protons together at relativistic speeds hundreds of millions of times per second. The data produced from those trillions of collisions, roughly 13 million gigabytes’ worth of raw information, was processed by the PanDA system across a worldwide network and made available to thousands of scientists around the globe. From there, they were able to pinpoint an unknown boson with a mass between 125 and 127 GeV, a characteristic consistent with the long-sought Higgs.

An ATLAS event with two muons and two electrons - a candidate for a Higgs-like decay. The two muons are picked out as long blue tracks, the two electrons as short blue tracks matching green clusters of energy in the calorimeters. ATLAS Experiment © 2012 CERN.

The sheer amount of data arises from the fact that each particle collision carries unique signatures that compete for attention with the millions of other collisions happening nanoseconds later. These must be recorded, processed, and analyzed as distinct events in a steady stream of information.

This article first appeared in ISGTW Dec. 21, 2011.

A night-time view of the Tevatron. Photo by Reidar Hahn.

This is the first part of a two-part series on the contribution Tevatron-related computing has made to the world of computing. This part begins in 1981, when the Tevatron was under construction, and brings us up to recent times. The second part will focus on the most recent years, and look ahead to future analysis.

Few laypeople think of computing innovation in connection with the Tevatron particle accelerator, which shut down earlier this year. Mention of the Tevatron inspires images of majestic machinery, or thoughts of immense energies and groundbreaking physics research, not circuit boards, hardware, networks, and software.

Yet over the course of more than three decades of planning and operation, a tremendous amount of computing innovation was necessary to keep the data flowing and physics results coming. In fact, computing continues to do its work. Although the proton and antiproton beams no longer brighten the Tevatron’s tunnel, physicists expect to be using computing to continue analyzing a vast quantity of collected data for several years to come.

When all that data is analyzed, when all the physics results are published, the Tevatron will leave behind an enduring legacy. Not just a physics legacy, but also a computing legacy.

In the beginning: The fixed-target experiments

This image of an ACP system was taken in 1988. Photo by Reidar Hahn.

1981. The first Indiana Jones movie is released. Ronald Reagan is the U.S. President. Prince Charles makes Diana a Princess. And the first personal computers are introduced by IBM, setting the stage for a burst of computing innovation.

Meanwhile, at the Fermi National Accelerator Laboratory in Batavia, Illinois, the Tevatron has been under development for two years. And in 1982, the Advanced Computer Program formed to confront key particle physics computing problems. ACP tried something new in high performance computing: building custom systems using commercial components, which were rapidly dropping in price thanks to the introduction of personal computers. For a fraction of the cost, the resulting 100-node system doubled the processing power of Fermilab’s contemporary mainframe-style supercomputers.

“The use of farms of parallel computers based upon commercially available processors is largely an invention of the ACP,” said Mark Fischler, a Fermilab researcher who was part of the ACP. “This is an innovation which laid the philosophical foundation for the rise of high throughput computing, which is an industry standard in our field.”

The Tevatron fixed-target program, in which protons were accelerated to record-setting speeds before striking a stationary target, launched in 1983 with five separate experiments. When ACP’s system went online in 1986, the experiments were able to rapidly work through an accumulated three years of data in a fraction of that time.

Entering the collider era: Protons and antiprotons and run one

1985. NSFNET (National Science Foundation Network), one of the precursors to the modern Internet, is launched. And the Tevatron’s CDF detector sees its first proton-antiproton collisions, although the Tevatron’s official collider run one won’t begin until 1992.

The experiment’s central computing architecture filtered incoming data by running Fortran-77 algorithms on ACP’s 32-bit processors. But for run one, they needed more powerful computing systems.

By that time, commercial workstation prices had dropped so low that networking them together was simply more cost-effective than a new ACP system. ACP had one more major contribution to make, however: the Cooperative Processes Software.

CPS divided a computational task into a set of processes and distributed them across a processor farm – a collection of networked workstations. Although the term “high throughput computing” was not coined until 1996, CPS fits the HTC mold. As with modern HTC, farms using CPS are not supercomputer replacements. They are designed to be cost-effective platforms for solving specific compute-intensive problems in which each byte of data read requires 500-2000 machine instructions.
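The farm pattern described above can be sketched in a few lines of Python. This is only an illustration of splitting one job into independent pieces and handing them to a pool of workers. It is not CPS itself: the names `reconstruct_event` and `run_on_farm` are hypothetical, and threads stand in for networked workstations.

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct_event(raw_event):
    """Stand-in for a compute-heavy step: many operations per value read."""
    # Hypothetical workload: sum of squares of the raw readings.
    return sum(value * value for value in raw_event)


def run_on_farm(events, n_workers=4):
    """Distribute independent events across a pool of workers.

    Each event is processed on its own, so workers never need to talk
    to each other; that independence is what lets a farm scale cheaply.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as farm:
        # map() returns results in input order, even if workers
        # finish out of order.
        return list(farm.map(reconstruct_event, events))


events = [[1, 2, 3], [4, 5], [6]]
results = run_on_farm(events)
```

Because no event depends on another, adding workers raises throughput without changing the code that processes each event.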

CPS went into production-level use at Fermilab in 1989; by 1992 it was being used by nine Fermilab experiments as well as a number of other groups worldwide.

1992 was also the year that the Tevatron’s second detector experiment, DZero, saw its first collisions. DZero launched with 50 traditional compute nodes running in parallel, connected to the detector electronics; the nodes executed filtering software written in Fortran, E-Pascal, and C.

Gearing up for run two

"The Great Wall" of 8mm tape drives at the Tagged Photon Laboratory, circa 1990 - from the days before tape robots. Photo by Reidar Hahn.

1990. CERN’s Tim Berners-Lee launches the first publicly accessible World Wide Web server using his URL and HTML standards. One year later, Linus Torvalds releases Linux to several Usenet newsgroups. And both DZero and CDF begin planning for the Tevatron’s collider run two.

Between the end of collider run one in 1996 and the beginning of run two in 2001, the accelerator and detectors were scheduled for substantial upgrades. Physicists anticipated more particle collisions at higher energies, and multiple interactions that were difficult to analyze and untangle. That translated into managing and storing 20 times the data from run one, and a growing need for computing resources for data analysis.

Enter the Run Two Computing Project (R2CP), in which representatives from both experiments collaborated with Fermilab’s Computing Division to find common solutions in areas ranging from visualization and physics analysis software to data access and storage management.

R2CP officially launched in 1996. It was the early days of the dot com era. eBay had existed for a year, and Google was still under development. IBM’s Deep Blue defeated chess master Garry Kasparov. And Linux was well-established as a reliable open-source operating system. The stage was set for experiments to get wired and start transferring their irreplaceable data to storage via Ethernet.

The high-tech tape robot used today. Photo by Reidar Hahn.

“It was a big leap of faith that it could be done over the network rather than putting tapes in a car and driving them from one location to another on the site,” said Stephen Wolbers, head of the scientific computing facilities in Fermilab’s computing sector. He added ruefully, “It seems obvious now.”

The R2CP’s philosophy was to use commercial technologies wherever possible. In the realm of data storage and management, however, none of the existing commercial software met their needs. To fill the gap, teams within the R2CP created Enstore and the Sequential Access Model (SAM, which later stood for Sequential Access through Meta-data). Enstore interfaces with the data tapes stored in automated tape robots, while SAM provides distributed data access and flexible dataset history and management.

By the time the Tevatron’s run two began in 2001, DZero was using both Enstore and SAM, and by 2003, CDF was also up and running on both systems.

Linux comes into play

The R2CP’s PC Farm Project targeted the issue of computing power for data analysis. Between 1997 and 1998, the project team successfully ported CPS and CDF’s analysis software to Linux. To take the next step and deploy the system more widely for CDF, however, they needed their own version of Red Hat Linux. Fermi Linux was born, offering improved security and a customized installer; CDF migrated to the PC Farm model in 1998.

The early computer farms at Fermilab, when they ran a version of Red Hat Linux (circa 1999). Photo by Reidar Hahn.

Fermi Linux enjoyed limited adoption outside of Fermilab until 2003, when Red Hat Enterprise Linux ceased to be free. The Fermi Linux team rebuilt Red Hat Enterprise Linux into the prototype of Scientific Linux, and formed partnerships with colleagues at CERN in Geneva, Switzerland, as well as a number of other institutions; Scientific Linux was designed for site customizations, so that in supporting it they also supported Scientific Linux Fermi and Scientific Linux CERN.

Today, Scientific Linux is ranked 16th among open source operating systems; the latest version was downloaded over 3.5 million times in the first month following its release. It is used at government laboratories, universities, and even corporations all over the world.

“When we started Scientific Linux, we didn’t anticipate such widespread success,” said Connie Sieh, a Fermilab researcher and one of the leads on the Scientific Linux project. “We’re proud, though, that our work allows researchers across so many fields of study to keep on doing their science.”

Grid computing takes over

As both CDF and DZero datasets grew, so did the need for computing power. Dedicated computing farms reconstructed data, and users analyzed it using separate computing systems.

“As we moved into run two, people realized that we just couldn’t scale the system up to larger sizes,” Wolbers said. “We realized that there was really an opportunity here to use the same computer farms that we were using for reconstructing data, for user analysis.”

A wide-angle view of the modern Grid Computing Center at Fermilab. Today, the GCC provides computing to the Tevatron experiments as well as the Open Science Grid and the Worldwide Large Hadron Collider Computing Grid. Photo by Reidar Hahn.

Today, the concept of opportunistic computing is closely linked to grid computing. But in 1996 the term “grid computing” had yet to be coined. The Condor Project had been developing tools for opportunistic computing since 1988. In 1998, the first Globus Toolkit was released. Experimental grid infrastructures were popping up everywhere, and in 2003, Fermilab researchers, led by DZero, partnered with the US Particle Physics Data Grid, the UK’s GridPP, CDF, the Condor team, the Globus team, and others to create the Job and Information Management system, JIM. Combining JIM with SAM resulted in a grid-enabled version of SAM: SAMGrid.

“A pioneering idea of SAMGrid was to use the Condor Match-Making service as a decision making broker for routing of jobs, a concept that was later adopted by other grids,” said Fermilab-based DZero scientist Adam Lyon. “This is an example of the DZero experiment contributing to the development of the core Grid technologies.”
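The idea of a match-making broker can be illustrated with a minimal Python sketch. It is a simplified stand-in, not the Condor Match-Making service or SAMGrid’s actual logic: the site names, the attribute names (`free_cpus`, `memory_gb`) and the first-fit policy are all assumptions made for the example.

```python
def matches(job, machine):
    """A job matches a machine if the machine meets every requirement."""
    return (machine["free_cpus"] >= job["cpus"]
            and machine["memory_gb"] >= job["memory_gb"])


def route_jobs(jobs, machines):
    """Assign each job to the first machine that can run it.

    Returns a dict mapping job name to machine name, or to None when
    no machine currently satisfies the job's requirements.
    """
    # Work on copies so the caller's machine descriptions stay untouched.
    free = {m["name"]: dict(m) for m in machines}
    assignments = {}
    for job in jobs:
        assignments[job["name"]] = None
        for machine in free.values():
            if matches(job, machine):
                assignments[job["name"]] = machine["name"]
                machine["free_cpus"] -= job["cpus"]  # claim the CPUs
                break
    return assignments


machines = [
    {"name": "site-a", "free_cpus": 2, "memory_gb": 4},
    {"name": "site-b", "free_cpus": 8, "memory_gb": 16},
]
jobs = [
    {"name": "reco-1", "cpus": 2, "memory_gb": 2},
    {"name": "reco-2", "cpus": 2, "memory_gb": 8},
    {"name": "reco-3", "cpus": 16, "memory_gb": 8},
]
assignments = route_jobs(jobs, machines)
```

Here `reco-1` lands on `site-a`, `reco-2` needs more memory and goes to `site-b`, and `reco-3` stays unassigned until a large enough machine advertises itself.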

By April 2003, the SAMGrid prototype was running on six clusters across two continents, setting the stage for the transition to the Open Science Grid in 2006.

From the Tevatron to the LHC – and beyond

Throughout run two, researchers continued to improve the computing infrastructure for both experiments. A number of computing innovations emerged before the run ended in September 2011. Among these was CDF’s GlideCAF, a system that used the Condor glide-in system and Generic Connection Brokering to provide an avenue through which CDF could submit jobs to the Open Science Grid. GlideCAF served as the starting point for the subsequent development of a more generic glidein Work Management System. Today glideinWMS is used by a wide variety of research projects across diverse research disciplines.

Another notable contribution was the Frontier system, which was originally designed by CDF to distribute data from central databases to numerous clients around the world. Frontier is optimized for applications where there are large numbers of widely distributed clients that read the same data at about the same time. Today, Frontier is used by CMS and ATLAS at the LHC.
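The access pattern described here, many clients reading the same data at about the same time, is one that a read-through cache serves well. The Python sketch below shows only that general idea and makes no claim about how Frontier is built; `CentralDatabase`, `CachingProxy` and the record name are hypothetical.

```python
class CentralDatabase:
    """Stand-in for a central database; counts how often it is queried."""

    def __init__(self, records):
        self._records = records
        self.queries = 0

    def fetch(self, key):
        self.queries += 1
        return self._records[key]


class CachingProxy:
    """Answers repeated reads from memory instead of the database."""

    def __init__(self, database):
        self._database = database
        self._cache = {}

    def get(self, key):
        # Only the first request for a key reaches the database.
        if key not in self._cache:
            self._cache[key] = self._database.fetch(key)
        return self._cache[key]


database = CentralDatabase({"calibration-run-1": 0.98})
proxy = CachingProxy(database)

# 1,000 clients ask for the same record at about the same time.
answers = [proxy.get("calibration-run-1") for _ in range(1000)]
```

After the loop, `database.queries` is 1: the central database answered once and the proxy served the remaining 999 reads.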

“By the time the Tevatron shut down, DZero was processing collision events in near real-time and CDF was not far behind,” said Patricia McBride, the head of scientific programs in Fermilab’s computing sector. “We’ve come a long way; a few decades ago the fixed-target experiments would wait months before they could conduct the most basic data analysis.”

One of the key outcomes of computing at the Tevatron was the expertise developed at Fermilab over the years. Today, the Fermilab computing sector has become a worldwide leader in scientific computing for particle physics, astrophysics, and other related fields. Some of the field’s top experts worked on computing for the Tevatron. Some of those experts have moved on to work elsewhere, while others remain at Fermilab where work continues on Tevatron data analysis, a variety of Fermilab experiments, and of course the LHC.

The accomplishments of the many contributors to Tevatron-related computing are noteworthy. But there is a larger picture here.

“Whether in the form of concepts, or software, over the years the Tevatron has exerted an undeniable influence on the field of scientific computing,” said Ruth Pordes, Fermilab’s head of grids and outreach. “We’re very proud of the computing legacy we’ve left behind for the broader world of science.”

– Miriam Boon

On May 26, 2005, a new supercomputer, a pioneering giant of its time, was unveiled at Brookhaven National Laboratory at a dedication ceremony attended by physicists from around the world. That supercomputer was called QCDOC, for quantum chromodynamics (QCD) on a chip, and it was capable of handling the complex calculations of QCD, the theory that describes the nature and interactions of the basic building blocks of the universe. Now, after a career of state-of-the-art physics calculations, QCDOC has been retired and will soon be replaced by a new “next generation” machine.

Steve in Geneva

Wednesday, October 19th, 2011

Here I am at CERN, for the first time in more than three months. When I was here this summer, I stayed for five weeks and had my family along with me. Now I’m just here for a short stay and rooming in the hostel again. But in some ways, it feels like I never left. (Except for the jet lag, of course.) The exciting times continue on the LHC experiments. We are under two weeks from the end of this year’s proton run, and we are eager to gather every last bit of data we can before the heavy-ion run and then a technical stop that won’t end until sometime in March. The dataset that we will end with will be more than twice as big as that which we analyzed for results that went to conferences this summer, so it will be very interesting to see what emerges with the additional data.

You might not have heard, but since the last time I posted, Apple co-founder and CEO Steve Jobs died. Obviously Jobs had a huge impact on how we live in our technological world. In the days after his death, I read articles discussing his influence on computing, design, music, publishing, politics, and so forth. Eager to jump onto the bandwagon, I decided to take a pilgrimage to the CERN visitor center at the Globe, located across the street from the Meyrin site. There, you can find this computer in a display area:

A NeXT computer, from 1990.

The ratty sticker on the front implores passers-by not to shut down the computer. The computer is a NeXT, a product of the company that Jobs founded after he was forced out of Apple in the 1980s. This happens to be the computer that belonged to Tim Berners-Lee, the first developer of what we now know as the World Wide Web, and it hosted the first Web server. (Do not shut down, indeed! Someone on the other side of the world might be using that computer.)

It’s true, we trot this one out a lot in particle physics, but the Web was invented by particle physicists to be used as an information and document sharing system, and it ended up changing the world. Particle physics has driven many developments in computer science over the years, as we’ve long had large datasets and computationally-intensive problems. These days, I feel like I see a lot of back and forth between particle physics and the computing world. Because of the scale of the data volume that we serve and the number of users who want to access it, and because we’re trying to do it on the relatively cheap, we’ve moved to a model of distributed computing that is realized in the Worldwide LHC Computing Grid. Grid computing, which allows straightforward access to computing resources owned by others that aren’t being used at the moment, has been adopted across sciences that do large-scale computing, and cloud computing is an offshoot of this development.

At the same time, we are definitely making use of computing technologies that have been developed in the commercial world. My favorite example of this is Hadoop. It’s a very powerful set of tools, and many US LHC computing sites are using its disk-management system, which is also used by Web sites like Facebook. It has good scaling properties and is easy to maintain, making life easier for site operators. We’re always on the lookout for new ideas that we can bring in from the computing world that will make it easier for physicists to make the most out of the LHC data.

Thanks to all of these tools, someone — perhaps very soon — will be making a plot that could show evidence for new physical phenomena. It wouldn’t be possible without the computing systems that I just described. Will this plot be viewed for the first time on the screen of an Apple product? Will that very screen end up in a display at the Globe? We’ll see.

Coming attractions at the LHC

Friday, September 2nd, 2011

It’s Labor Day weekend here in the US, but over at CERN it’s the end of the August technical stop for the LHC. To rework a common saying, this is the first day of the rest of the 2011 run. We have two months left of proton-proton collisions, followed by one month of lead-lead collisions, and then in December we’ll have the holiday “extended technical stop” that will probably extend to the spring.

We’re expecting an important change in running conditions once we return from the technical stop, and that is a change in how the beams are focused. This will lead to an increased rate of collisions. Remember that the proton beams are “bunched”; the beam is not a continuous stream of particles but bunches with a large separation between them. The change in the focusing will help make the bunches more compact, and that in turn will mean that there will be more proton collisions every time a pair of bunches pass through each other. When our detectors record data, they record an entire bunch crossing as a single event. Thus, each individual event will be busier, with more collisions and more particles produced.

This is good news from a physics perspective — the more collisions happen, the greater the chance that there will be something interesting coming out. But it’s a challenge from an operational perspective. We try to record as many “interesting” events as possible, but we’re ultimately limited by how quickly we can read out the detector and how much space we have to store the data. Given that we’re going to have more data coming into fixed resources, we’re going to have to limit our definition of “interesting” a little further. The busier events are also a greater strain on the software and computing for the experiments (which I focus on). Each event takes more CPU time to process and requires more RAM. Previous experience and simulations give us some guidance as to how all of this will scale up from what we’ve seen so far, but we can’t know for sure without actually doing it. (The original plan for the machine development studies period before the technical stop was supposed to include a small-scale test of this, so that we could put the computing and everything else through its paces. But that got cancelled. I had originally planned to blog about that. Oh well.)

However, all of this will be worth the trouble. Remember all of the excitement of the EPS conference? That was at the end of July, just a little more than a month ago. There is now about twice as much data that can be analyzed. With the increases in collision rate, we might well be able to double the dataset once again just in these next two months. Or, we might do even better. This will have a critical impact on our searches for new phenomena, and could allow the LHC experiments to discover or rule out the standard-model Higgs boson by the end of this year. Coming soon, to a theater near you.

Fermilab theoretical physicist Paul Mackenzie, spokesperson for the USQCD collaboration. Photo credit: Reidar Hahn.

The field of high-energy physics has always considered itself a family. To address some of the largest questions, such as how we and the universe were formed, it takes building-sized machines, enormous computing power and more resources than one nation can muster. This necessary collaboration has forged strong bonds among physicists and engineers across the globe.

So naturally, when a tsunami and series of earthquakes struck Japan on March 11, physicists in the U.S. started asking how they could help. Japan is home to one of the world’s largest high-energy physics laboratories and an accelerator research center, and it turns out that U.S. physicists have a unique resource to offer: computer power.

Lattice Quantum Chromodynamics (QCD) is a computational technique used to study the interactions of quarks and gluons and requires vast computing power. To help the Japanese continue this analysis, Fermilab and other U.S. labs will share their Lattice QCD computing resources.

“We’re very happy that the shared use of our resources can allow our Japanese colleagues to continue their research during a time of crisis,” said Fermilab theoretical physicist Paul Mackenzie, spokesperson for the USQCD collaboration.

From now until the end of 2011, while computing facilities in eastern Japan face continuing electricity shortages, a percentage of the computing power at Brookhaven National Laboratory on Long Island, Fermi National Accelerator Laboratory near Chicago and Thomas Jefferson National Accelerator Facility in Virginia will be made available to the Japanese Lattice Quantum Chromodynamics (QCD) community.

“We appreciate the support from the U.S. QCD community,” said University of Tsukuba Vice President Akira Ukawa, spokesperson of the Japanese Lattice QCD community. “The sharing of resources will not only be instrumental to continue research in Japan through the current crisis, but will also mark a significant step in strengthening the international collaboration for progress in our field.”

Read the Fermilab press release here: http://www.fnal.gov/pub/presspass/press_releases/2011/USQCDrelease_052311.html

Related news:

 Japanese helped foreign scientists during quake

Japanese earthquake jolts Tevatron, emotions

Damage caused by the recent earthquake and recovery prospects

This story appeared in Fermilab Today March 3.

The Linux operating system produced at Fermilab enabled the laboratory and other high-energy physics institutions to build large physics data analysis clusters using affordable, commercially available computers. The photo shows computer clusters in the laboratory's Grid Computing Center. Credit: Fermilab

For more than 12 years, Fermilab has supplied thousands of individuals in the scientific community with the operating system that forms the foundation for their exploration of the universe’s secrets. The Linux operating system produced at Fermilab enabled the laboratory and other high-energy physics institutions to build large physics data analysis clusters using affordable, commercially available computers.

The newest version of Scientific Linux is now available.

Fermilab began packaging and distributing Scientific Linux in 2004 to the broad high-energy physics community. At that time, it was used on only 1,500 machines. Today, Scientific Linux is run on tens of thousands of machines and is the operating system that powers some of the world’s largest physics experiments, including some experiments at the Large Hadron Collider. The newest version, Scientific Linux 6, is put together by the Fermilab Computing Division, specifically the Fermilab Experiments Facilities Department, and by DESY, CERN and other laboratories and universities across the world.

“This version of Scientific Linux continues a tradition of technical excellence,” said Jason Allen, head of Fermilab Experiments Facilities Department in the laboratory’s Computing Division. “This product is the result of users worldwide who have contributed, tested and provided feedback for this release.”

Fermilab modifies Scientific Linux, the base product, to include security measures and other laboratory-specific elements to create Scientific Linux Fermi. The newest version of Scientific Linux Fermi 6 will be released at Fermilab later this year.

 – Kimberly Myles and Edward Simmonds

Top left image shows SDSS-III's view of a small part of the sky, centered on the galaxy Messier 33. The middle top picture is a zoomed-in image on M33, showing the spiral arms of this galaxy, including the blue knots of intense star formation. The top right-hand image shows a further zoomed-in image of M33 highlighting one of the largest areas of intense star formation in that galaxy. Credit: SDSS

The world’s largest, digital, color image of the night sky became public this month. It provides a stunning image and research fodder for scientists and science enthusiasts, thanks to the Sloan Digital Sky Survey, which has a long connection to Fermilab.

Oh, yeah, and the image is free.

The image, which would require 500,000 high-definition TVs to view in its full resolution, is composed of data collected since the start of the survey in 1998.
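The TV comparison can be checked with a quick calculation. The short Python sketch below assumes that "high-definition" means a 1920 × 1080 screen, which the article does not specify.

```python
# Assumption: one high-definition TV shows 1920 x 1080 pixels.
pixels_per_tv = 1920 * 1080
tv_count = 500_000

total_pixels = pixels_per_tv * tv_count
trillions_of_pixels = total_pixels / 10**12
```

Under that assumption the total comes to a little over one trillion pixels, which agrees with the article's description of the image as more than a trillion pixels.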

“This image provides opportunities for many new scientific discoveries in the years to come,” said Bob Nichol, SDSS-III scientific spokesperson and professor at University of Portsmouth.

Fermilab oversaw all image processing and distribution of data to researchers and the public from 1998 through 2008, for the first seven batches of data. These batches make up a large chunk of the ground-breaking image of more than a trillion pixels. The eighth batch of raw, reduced data, which was released along with the image at the 217th meeting of the American Astronomical Society in Seattle, was processed by Lawrence Berkeley National Laboratory. LBNL, New York University and Johns Hopkins University distributed that data. Fermilab’s SDSS collaboration members now focus solely on analysis.

“This is one of the biggest bounties in the history of science,” said Mike Blanton, professor from New York University and leader of the data archive work in SDSS-III, the third phase of SDSS. “This data will be a legacy for the ages, as previous ambitious sky surveys like the Palomar Sky Survey of the 1950s are still being used today. We expect the SDSS data to have that sort of shelf life.”

The release expands the sky coverage of SDSS to include a sizable view of the south galactic pole. Previously, SDSS only imaged small, spread out slivers of the southern sky. Increasing coverage of the southern sky will aid the Dark Energy Survey and the Large Synoptic Survey Telescope, both southern sky surveys that Fermilab participates in.

Comparing the two portions of the sky also will help astrophysicists pinpoint any asymmetries in the type or number of large structures, such as galaxies. Cosmic-scale solutions to Albert Einstein’s equations of general relativity assume that the universe is spherically symmetric, meaning that on a large enough scale, the universe would look the same in every direction.

Finding asymmetry would mean the current understanding of the universe is wrong and turn the study of cosmology on its head, much as the discovery of particles not included in the Standard Model would do for collider physics.

“We would have to rethink our understanding of cosmology,” said Brian Yanny, Fermilab’s lead scientist on SDSS-III. So far the universe seems symmetric.

Whether the SDSS data reveals asymmetry or not, it undoubtedly will continue to provide valuable insight into our universe and fascinate amateur astronomers and researchers.

Every year since the start of the survey, at least one paper about the SDSS has made it onto the list of the top 10 astronomy papers of the year. More than 200,000 people have classified galaxies from their home computers using SDSS data in projects including Galaxy Zoo and Galaxy Zoo 2.

In the three months leading up to the image’s release, a record number of queries, akin to click counts on a Web page, occurred on the seventh batch of data. During that time, 90 terabytes of pictures and sky catalogues were downloaded by scientists and the public. That equates to about 150,000 one-hour-long CDs.
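The CD comparison is easy to verify. The Python sketch below assumes decimal units (1 terabyte = 10^12 bytes), a convention the article does not state.

```python
# Assumption: decimal units, so 1 TB = 10**12 bytes and 1 MB = 10**6 bytes.
TERABYTE = 10**12
MEGABYTE = 10**6

downloaded_bytes = 90 * TERABYTE
cd_count = 150_000

megabytes_per_cd = downloaded_bytes / cd_count / MEGABYTE
```

The result is 600 MB per disc, in line with an hour of CD audio at roughly 10 MB per minute.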

Scientists will continue to use the old data and produce papers from it for years to come. Early data also works as a check on the new data to make sure camera or processing flaws didn’t produce data anomalies.

“We still see, for instance, data release six gets considerable hits and papers still come out on that in 100s per year,” Yanny said.

So far, SDSS data has been used to discover nearly half a billion astronomical objects, including asteroids, stars, galaxies and distant quasars. This new eighth batch of data promises even more discoveries.

Fermilab passed the job of data processing and distribution on to others in 2008. The eighth batch of data was processed by Lawrence Berkeley National Laboratory and distributed by LBNL, New York University and Johns Hopkins University.

Fermilab’s four remaining SDSS collaboration members now focus solely on analysis. They are expected to produce a couple dozen papers during the next few years. The group touches on all four of SDSS-III’s sky surveys but focuses mainly on the Baryon Oscillation Spectroscopic Survey, or BOSS, which will map the 3-D distribution of 1.5 million luminous red galaxies.

Illustration of the concept of baryon acoustic oscillations, which are imprinted in the early universe and can still be seen today in galaxy surveys like BOSS. Credit: Chris Blake and Sam Moorfield and SDSS.

“BOSS is closest to our scientists’ interests because its science goals are to understand dark energy and dark matter and the evolution of the universe,” Yanny said.

For more information see the following:

* Larger images of the SDSS maps in the northern and southern galactic hemispheres are available here and here.

* Sloan’s YouTube channel provides a 3-D visualization of the universe.

* Technical journal papers describing DR8 and the SDSS-III project can be found on the arXiv e-Print server.

* EarthSky has a good explanation of what the colors in the images represent and how SDSS is part of an ongoing tradition of sky surveys.

* The Guardian newspaper has a nice article explaining all the detail that can be seen in the image.

– Tona Kunz


To celebrate its 30th anniversary, Discover magazine created a list of The 12 Most Important Trends in Science Over the Past 30 Years. High-energy particle physics and Fermilab played a part in three of these 12 game-changing research breakthroughs. Here’s a look at these Discover-selected trends and Fermilab’s contributions to them.

Trend: The Web Takes Over

Pictured is Fermilab's 2001 home page, which was designed in 1996. Twenty years ago, Fermilab helped to pioneer the URL. It launched one of the first Web sites in the country in 1992. Credit: Fermilab

The first concept for what would become the World Wide Web was proposed by a high-energy particle physicist in 1989 to help physicists on international collaborations share large amounts of data. The first WWW system was created for high-energy physicists in 1991 under the guidance of CERN. 

A year later, Fermilab became the second institution in the United States to launch a website. It also helped initiate the switch to easy-to-remember domain name addresses rather than Internet Protocol addresses, which are strings of numbers. This switch helped spur the growth of the Internet and WWW.

Particle physics also secured a place in sports history through its computing savvy. A softball club at CERN, composed of mostly visiting European and American physicists, many connected to Fermilab, was the first ball club in the world to have a page on the World Wide Web, beating out any team from Major League Baseball.

Trend: Universe on a Scale

The field of cosmology has advanced and created a more precise understanding of the evolution and nature of the universe. This has brought high-energy particle physics, cosmology and astronomy closer together. They have begun to overlap in the key areas of dark energy, dark matter and the evolution of the universe. Discover magazine cites two advances as particularly noteworthy in these areas: the first precise measurement of cosmic microwave background, or CMB, radiation left over from the Big Bang, and the discovery, with the aid of supernovas, that the expansion of the universe is accelerating.

Dark Energy Camera under construction at Fermilab. Credit: Fermilab

Fermilab physicists study the CMB with the Q/U Imaging Experiment, or QUIET. They study dark energy with several experiments, most notably the long-running Sloan Digital Sky Survey; the Dark Energy Survey, which will be operational at the end of the year; and the Large Synoptic Survey Telescope, which could begin operating at the end of the decade or in the middle of the next.

Trend: Physics Seeks the One

During the last few decades the particle physics community has sought to build a mammoth international machine that can probe the tiniest particles of matter not seen in nature since just after the time of the Big Bang.

Initially, this machine was planned for the United States and named the Superconducting Super Collider. Scientists and engineers from Fermilab helped with the design and the suite of science experiments for the SSC, which was under construction in Texas until it was canceled in 1993.

A similar machine, the Large Hadron Collider in Switzerland, did take shape, starting operation in 2008. Fermilab played a key role in the design, construction and R&D of the accelerator, contributing project managers, cutting-edge superconducting magnet technology and expertise garnered through construction of the Tevatron accelerator.

The U.S. CMS remote operation center at Fermilab. Credit: Fermilab

Fermilab now serves as a remote operation center for CMS, one of the two largest experiments at the LHC. Many physicists work on CMS as well as on one of the Tevatron’s two detector teams, DZero or CDF. The United States has the largest national contingent within CMS, accounting for more than 900 physicists in the 3,600-member collaboration.

Fermilab’s computing division serves as one of two “Tier-1” computing distribution centers in the United States for LHC data. In this capacity, Fermilab provides storage and processing capacity for data collected at the LHC that is analyzed by physicists at Fermilab and sent to U.S. universities for analysis there.

Discover magazine cited the search for the Higgs boson as a goal of the LHC. The Higgs is a theorized particle thought to endow other particles with mass, which allows gravity to act upon them so they can come together to form everything in the visible world, such as people, planets and plants. The LHC and the Tevatron are racing to find the Higgs first. The Tevatron has an advantage searching in the lower mass range and the LHC in the higher mass range. Theorists suspect the Higgs lives in the lower mass range. So far, the Tevatron has greatly narrowed the possible hiding places for the Higgs in this range.

– Tona Kunz


Millions of Simulations

Thursday, June 17th, 2010

Proton-proton collision simulation "jobs" for the CMS detector running on the grid.

To compare with the data we record from our detector (CMS), we need to run a few simulations…well more like billions of simulations.

Each “job” in the plot above is actually a program running on a computer at a university.  Each program typically simulates a few hundred, or a few thousand, proton-proton collisions.  Each individual “collision simulation” calculates what a certain kind of collision would look like in our 12,500-ton detector.

And I don’t mean they just make pretty pictures. A single simulation really consists of: some particles within each proton interact with some probability, they produce other particles with some probability, those particles decay to other particles with some probability, and so on… Eventually, stable particles are made and the passage of those particles through the detector is also simulated.
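To give a flavor of that chain of "with some probability" steps, here is a toy sketch in Python. It is not real physics and not the CMS software: the single decay rule, the probability value and every name in it are hypothetical stand-ins for what a real event generator does with measured quantities.

```python
import random

# Toy cascade, NOT real physics: the rule and the probability below are
# hypothetical stand-ins for the measured values a real generator uses.
P_DAUGHTER_STABLE = 0.7  # assumed chance that a decay product is stable
MAX_PARTICLES = 10_000   # safety cap so the toy loop always terminates


def simulate_collision(rng):
    """Return the number of stable particles produced by one toy collision."""
    unstable = 1  # start from a single unstable particle
    stable = 0
    while unstable > 0 and stable + unstable < MAX_PARTICLES:
        unstable -= 1  # one unstable particle decays
        # Each decay yields two daughters; each is stable with some probability.
        for _ in range(2):
            if rng.random() < P_DAUGHTER_STABLE:
                stable += 1
            else:
                unstable += 1
    return stable


rng = random.Random(12345)  # fixed seed so the run is reproducible
counts = [simulate_collision(rng) for _ in range(1000)]
print(sum(counts) / len(counts))  # average stable particles per toy collision
```

A real simulation replaces the one toy rule with many interaction and decay channels, and then tracks each stable particle through a model of the detector.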

As you can imagine, this requires a lot of random numbers. One mistake that happens sometimes is that different jobs have the same initial ‘seed’ for the random numbers, and this results in duplication of simulations. Not only is that a waste of CPU cycles, but it also means a fuller range of collision possibilities doesn’t get simulated.
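The seed problem is easy to demonstrate with Python's standard `random` module. The sketch below is only an illustration: the job names, the helper functions and the "base seed plus job index" scheme are assumptions of mine, and real simulation frameworks have their own seeding mechanisms.

```python
import random
from collections import Counter


def find_duplicate_seeds(job_seeds):
    """Map each seed used by more than one job to the jobs that share it."""
    counts = Counter(job_seeds.values())
    return {
        seed: sorted(job for job, s in job_seeds.items() if s == seed)
        for seed, n in counts.items()
        if n > 1
    }


def assign_unique_seeds(job_ids, base_seed=1000):
    """Give every job a distinct seed: the base seed plus the job's index."""
    return {job: base_seed + i for i, job in enumerate(job_ids)}


# Two generators started from the same seed produce identical numbers,
# so two jobs sharing a seed would simulate the very same collisions.
a = random.Random(42)
b = random.Random(42)
print([a.random() for _ in range(3)] == [b.random() for _ in range(3)])  # True

bad = {"job_a": 42, "job_b": 42, "job_c": 7}  # hypothetical job names
print(find_duplicate_seeds(bad))  # {42: ['job_a', 'job_b']}

good = assign_unique_seeds(["job_a", "job_b", "job_c"])
print(find_duplicate_seeds(good))  # {}
```

Running a check like `find_duplicate_seeds` before submitting a batch is cheap compared with the CPU time lost to duplicated jobs.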

My job at times is to herd thousands of simulation jobs to various places, monitor them, and make sure they don’t crash and that they finish in a timely fashion to return the needed data.

By the way, when I wrote the job monitoring script that makes plots like the one above (written in Python and using matplotlib), I tried using each university’s school colors when I could, but sometimes that resulted in colors that were too similar or confusing.
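One way to catch colors that are too close before plotting is to compare them numerically. This is a minimal sketch rather than what my script actually does: the site names, hex colors and threshold are invented for illustration, and straight-line distance in RGB space is only a crude proxy for how different two colors look.

```python
from itertools import combinations

# Hypothetical site names and colors, for illustration only.
SITE_COLORS = {
    "site_a": "#cc0000",
    "site_b": "#c80a0a",
    "site_c": "#003366",
}


def hex_to_rgb(color):
    """Convert '#rrggbb' to a tuple of integers in 0..255."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def color_distance(c1, c2):
    """Euclidean distance between two colors in RGB space."""
    return sum((a - b) ** 2 for a, b in zip(hex_to_rgb(c1), hex_to_rgb(c2))) ** 0.5


def too_similar(colors, threshold=60.0):
    """Return pairs of names whose colors are closer than the (arbitrary) threshold."""
    return [
        (n1, n2)
        for (n1, c1), (n2, c2) in combinations(sorted(colors.items()), 2)
        if color_distance(c1, c2) < threshold
    ]


print(too_similar(SITE_COLORS))  # [('site_a', 'site_b')]
```

Hex strings like these can be passed directly as matplotlib colors, so any flagged pair can simply be swapped for a more distinct shade before the plot is drawn.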
