The Alan Sondheim Mail Archive

From the issue dated September 12, 2008

Data Deluge From Collider Prompts Next Big Information Revolution


When the Large Hadron Collider revs up to full capacity near Geneva, it will 
generate about 15 million gigabytes of data each year — enough to fill a stack 
of DVDs more than two miles high.
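The "stack of DVDs" figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes a 4.7-gigabyte single-layer DVD and a 1.2-millimeter disc thickness; those capacity and thickness figures are standard values, not numbers from the article.

```python
# Rough check of the data-volume claim: 15 million GB per year,
# stacked as single-layer DVDs (assumed 4.7 GB each, 1.2 mm thick).
DATA_PER_YEAR_GB = 15_000_000   # about 15 petabytes
DVD_CAPACITY_GB = 4.7
DVD_THICKNESS_MM = 1.2
MM_PER_MILE = 1_609_344

dvds = DATA_PER_YEAR_GB / DVD_CAPACITY_GB
stack_miles = dvds * DVD_THICKNESS_MM / MM_PER_MILE
print(f"{dvds:,.0f} DVDs, stack about {stack_miles:.1f} miles high")
```

The result, roughly 3.2 million discs in a stack about 2.4 miles tall, agrees with the article's "more than two miles" figure.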

So much information will be pouring out that it will equal about 1 percent of 
the total data produced each year throughout the world, says François Grey, 
head of communications for information technology at CERN, the European 
particle-physics laboratory where the collider is located.

The collider project will need to sort and store every bit of that information
and then make it available to physicists on every continent except Antarctica.

To meet this grand challenge, CERN has built up the LHC Computing Grid to 
handle the data and provide access for the 7,000 scientists from 500 universities 
and laboratories around the world who are participating in the experiment.

Often called the Grid, the distributed computing network will eventually link 
up 100,000 processors. About 20 percent of those CPUs sit in long rows of 
racks at CERN, with the rest spread around the globe at national labs and 
universities.

The computing facilities are distributed like the branches of a tree, with 
CERN as the main trunk, or Tier 0. It sends copies of all of the collider data 
to 11 major limbs called Tier 1 facilities.

The United States has two of these, at Brookhaven National Laboratory and at 
Fermi National Accelerator Laboratory, each of which serves one of the major 
research teams involved in the collider project.

The bulk of the computing power is spread out beyond these limbs, among 250 
smaller branches called Tier 2 centers.
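The tree-like fan-out described above, with Tier 0 pushing copies of the data down each branch, can be modeled in a few lines. The site names and the small topology below are illustrative stand-ins, not the real Grid configuration.

```python
# Toy model of the tiered distribution tree: Tier 0 replicates each
# dataset to its children, which replicate onward to their children.
# Sites shown are examples from the article; the wiring is hypothetical.
from collections import defaultdict

topology = defaultdict(list)  # parent site -> list of child sites
topology["CERN (Tier 0)"] = ["Brookhaven (Tier 1)", "Fermilab (Tier 1)"]
topology["Brookhaven (Tier 1)"] = ["UT Arlington (Tier 2)"]

def replicate(dataset, site="CERN (Tier 0)"):
    """Return a map of site -> dataset copy, walking the whole subtree."""
    copies = {site: dataset}
    for child in topology[site]:
        copies.update(replicate(dataset, child))
    return copies

print(sorted(replicate("collision-run-001")))
```

Each dataset ends up at every site reachable from the trunk, which is the point of the tier design: no single center has to serve every physicist directly.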

The University of Texas at Arlington is the lead institution for one of the 
Tier 2 centers in the United States. The university has devoted 1,000 
processors and 500,000 gigabytes of storage to the project, says Kaushik De, a 
professor of physics there and the center's coordinator.

When a physicist at a university wants to analyze some collider data, she 
submits her job through her computer at her institution. The LHC Grid software 
then goes out looking for the data, the programs, and the computing power she 
needs for the job.

The request might land at a local Tier 2 facility or it might travel to a 
Tier 1 halfway around the world. Once the available processors have finished the 
analysis, the Grid sends back the results to her own computer. "The best 
analogy for the Grid is a farming cooperative," says Mr. Grey. "By sharing 
resources, we can use them more efficiently."
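The matching step in that workflow, finding a site that both holds the data and has spare processors, can be sketched as a simple search. The site list, dataset names, and capacity numbers below are hypothetical; the real Grid middleware is far more elaborate.

```python
# Minimal sketch of grid job routing: prefer any site that holds the
# requested dataset and has a free processor; otherwise the job waits.
# All names and numbers here are made-up examples, not real Grid state.
sites = [
    {"name": "UT Arlington (Tier 2)", "datasets": {"run-42"}, "free_cpus": 0},
    {"name": "Fermilab (Tier 1)", "datasets": {"run-42", "run-43"}, "free_cpus": 120},
]

def submit(dataset):
    """Route a job to the first site with the data and spare capacity."""
    for site in sites:
        if dataset in site["datasets"] and site["free_cpus"] > 0:
            site["free_cpus"] -= 1  # claim one processor for this job
            return site["name"]
    return None  # no capacity anywhere: the job queues until one frees up

print(submit("run-42"))  # the Tier 2 copy is busy, so the job travels to a Tier 1
```

This is the "farming cooperative" idea in miniature: a busy local site does not block the physicist, because another member of the cooperative can run the job on its copy of the data.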

Unlike the World Wide Web, which was developed at CERN, grid computing was 
conceived by researchers in the United States in the 1990s. The fields of 
astronomy, biomedicine, and the earth sciences already use computing grids, 
as do companies like IBM and Hewlett-Packard.

But the LHC Grid will be the biggest test of this strategy yet, says Mr. 
Grey. "It's really putting the grid into practice."
  Section: The Faculty
  Volume 55, Issue 3, Page A15 
Copyright © 2008 by The Chronicle of Higher Education

