A Large Ion Collider Experiment (ALICE) studies the fundamental properties of nuclear matter produced in high-energy collisions at the Large Hadron Collider (LHC). At the core of this research lies the collection of immense amounts of data for analysis. This data goes through a meticulous and very compute-intensive process on its way from the detector to analysis-ready physics objects. Along with ALICE, three other LHC experiments, ATLAS, CMS, and LHCb, collect data, run extensive computations, and analyze the output for physics research.
Computing for the LHC experiments is organized under the Worldwide LHC Computing Grid (WLCG) collaboration. WLCG has adopted a tiered approach to data processing and storage: the Tier-0 centers for the experiments are located at CERN next to each detector, Tier-1 centers reconstruct and host the raw data, and Tier-2 centers, distributed around the globe, run simulation and data analysis and store analysis-ready datasets. However, it has become apparent that a new kind of computing facility would help accelerate data calibration, analysis tuning, and the final physics analysis. This is mainly because physics simulation and physics data analysis jobs have very different payload profiles when it comes to computing. Such Analysis Facilities (AFs) will be optimized for I/O-intensive data analysis jobs and will host a fraction of the datasets, on which analyses can be tuned with quick turnaround.
ALICE introduced the idea of AFs in its 2015 Technical Design Report (TDR). It envisioned staging about 10 PB of analysis-ready data distributed worldwide across AFs. Analysis jobs would run on the 10 PB of data daily, and that data would then be replaced with another subset of similar size. Such AFs began emerging in 2018 at GSI in Darmstadt, Germany, and at the Wigner Center in Hungary.
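To give a sense of the scale involved, the short sketch below estimates the average aggregate read rate implied by running analysis over 10 PB of data daily. It is a back-of-the-envelope illustration, not a figure from the TDR: it assumes decimal petabytes, a single full pass over the data per day, and an evenly spread load, and the function name and parameters are hypothetical.

```python
# Back-of-the-envelope estimate; the assumptions below are illustrative
# and are not taken from the ALICE TDR.
PB = 10**15                      # bytes per petabyte (decimal convention, assumed)
GB = 10**9                       # bytes per gigabyte (decimal convention, assumed)
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_read_rate_gb_per_s(dataset_pb: float, passes_per_day: float = 1.0) -> float:
    """Average aggregate read rate (GB/s) needed to read `dataset_pb`
    petabytes `passes_per_day` times within one day."""
    total_bytes = dataset_pb * PB * passes_per_day
    return total_bytes / SECONDS_PER_DAY / GB


# 10 PB read once per day, summed over all AFs worldwide.
rate = sustained_read_rate_gb_per_s(10)
print(f"{rate:.1f} GB/s aggregated across all AFs")
```

Under these assumptions the result is roughly 116 GB/s summed over all AFs, which illustrates why such facilities are tuned for I/O rather than for raw CPU throughput.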
Recently, WLCG started to evaluate the importance and effectiveness of such AFs for all LHC experiments, and it sees the need and benefit as well. ALICE, meanwhile, is leading this process.
On September 9, 2023, almost exactly one year ago, the ALICE-USA Computing Project, which operates two ALICE Tier-2 sites to fulfill US computing obligations on behalf of the 11 US institutions participating in the experiment, deployed the first US-hosted AF at LBNL. With this deployment, we demonstrated the ability to operate such a facility with extremely high availability and reliability. With the prototype in place as a proof of concept and operating effectively without any downtime, we plan to grow this resource into a full-scale AF. It will enable US scientists to host, control, and access one of the largest analysis-ready heavy-ion datasets collected.
The ALICE-USA Computing Project is a collaboration between Lawrence Berkeley National Laboratory and Oak Ridge National Laboratory. The project is managed by NSD’s Irakli Chakaberia. The LBNL site administration is performed by John White and Karen Fernsler from the LBNL ITD group.
Figure 1: CPU capacity delivered to the ALICE experiment by the new AF.
Figure 2: Share of different workloads on the new AF since its deployment.