Castor: CERN Advanced STORage Manager

Published: (June 4, 2026 at 03:52 PM EDT)

Source: Hacker News

The CERN Advanced STORage manager (CASTOR) is a hierarchical storage system (disk + tape) developed at CERN for archiving very large volumes of physics data. Files can be stored, listed, retrieved, and accessed remotely using the CASTOR command‑line tools or applications built on the CASTOR API. CASTOR offers several access protocols: XROOT (the recommended protocol) and GridFTP, with RFIO supported until 2016.
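As a rough illustration of remote access over XROOT, the sketch below builds a `root://` URL and the argument list for `xrdcp`, the copy tool shipped with the XRootD client. The host name and file path are hypothetical placeholders, not real CASTOR endpoints, and the command is only constructed, never executed.

```python
from typing import List

# Hypothetical endpoint and path, chosen only for illustration.
EXAMPLE_HOST = "castor.example.org"
EXAMPLE_PATH = "/castor/example.org/user/a/alice/run001.root"


def xroot_url(host: str, path: str) -> str:
    """Build a root:// URL; an absolute path follows a double slash."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    # The separator slash plus the path's leading slash give "//".
    return f"root://{host}/{path}"


def copy_command(host: str, remote_path: str, local_path: str) -> List[str]:
    """Return an xrdcp argument list (source, destination) without running it."""
    return ["xrdcp", xroot_url(host, remote_path), local_path]


if __name__ == "__main__":
    print(" ".join(copy_command(EXAMPLE_HOST, EXAMPLE_PATH, "run001.root")))
```

Returning an argument list rather than a shell string keeps the sketch safe to pass to `subprocess.run` later without quoting problems.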

CASTOR succeeded SHIFT, the Scalable Heterogeneous Integrated FaciliTy for HEP computing from the 1990s. The CERN Tape Archive (CTA) began operating as CASTOR’s successor on 29 June 2020 and is gradually replacing it. The accompanying plot shows the evolution of total data on tape at CERN since 2001, including statistics from CASTOR 1 (1998‑2007), CASTOR 2 (2005‑2022), and CTA (2020‑present).

Design

The architecture follows a component‑based design centered on a database that records the state changes of all CASTOR components.

  • Stager manages disk pools, allocating and reclaiming space, controlling client access, and maintaining the local disk catalogue.
  • Name Server stores the CASTOR namespace (files and directories) together with metadata (size, dates, checksum, ownership, ACLs, tape‑copy information). Unix‑style command‑line tools (e.g., nsls, the counterpart of ls) allow manipulation of the namespace.
  • Tape Infrastructure writes files to tape under defined conditions to ensure data safety and to extend storage beyond available disk capacity. CERN’s high‑capacity tape units include Oracle StorageTek T10000C (5 TB) and IBM TS1140 (4 TB). Cartridges reside in automated libraries (4 × Oracle SL8500 and 3 × IBM TS3500), giving a total tape archive capacity of ~100 PB as of January 2013.
  • Volume Manager tracks each tape’s characteristics, capacity, and status. The Name Server database records file‑level details on tape (ownership, permissions, offset location). User commands can query both databases.
  • Volume Drive Queue Manager (VDQM), together with library‑specific control software, handles mounting and dismounting of cartridges to tape drives.
  • Client provides users with upload, download, access, and management capabilities for CASTOR data.
  • Storage Resource Management (SRM) enables data access in a computing Grid via the SRM protocol, interfacing with CASTOR on behalf of users or services such as the File Transfer Service (FTS) used by the LHC community.
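To make the interplay of these components more concrete, here is a toy model of a hierarchical store. It is an illustrative sketch only and not how CASTOR is implemented: the class and method names (`ToyArchive`, `put`, `migrate`, `get`) are invented for this example, a dictionary stands in for the Name Server, another for the Stager's disk pool, and a third for tape. The one idea it tries to capture is that a disk copy may be reclaimed only once a tape copy exists, and that reading a reclaimed file triggers a recall from tape.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FileEntry:
    """Name-server style metadata for one file (toy subset)."""
    size: int
    tape_id: Optional[str] = None  # set once a tape copy exists


@dataclass
class ToyArchive:
    """Toy hierarchical store: a namespace, a disk pool, and tape."""
    disk_capacity: int
    namespace: Dict[str, FileEntry] = field(default_factory=dict)
    disk: Dict[str, bytes] = field(default_factory=dict)
    tape: Dict[str, bytes] = field(default_factory=dict)

    def disk_used(self) -> int:
        return sum(len(data) for data in self.disk.values())

    def put(self, path: str, data: bytes) -> None:
        """Store a new file on disk, reclaiming space first if needed."""
        if path in self.namespace:
            raise FileExistsError(path)  # files are write-once in this toy
        if len(data) > self.disk_capacity:
            raise ValueError("file larger than the disk pool")
        self._reclaim(len(data))
        self.disk[path] = data
        self.namespace[path] = FileEntry(size=len(data))

    def migrate(self, path: str, tape_id: str) -> None:
        """Write a tape copy and record it in the namespace."""
        self.tape[path] = self.disk[path]
        self.namespace[path].tape_id = tape_id

    def get(self, path: str) -> bytes:
        """Read from disk, recalling from tape if the disk copy is gone."""
        if path not in self.disk:
            data = self.tape[path]
            self._reclaim(len(data))
            self.disk[path] = data
        return self.disk[path]

    def _reclaim(self, needed: int) -> None:
        """Evict disk copies that have a tape copy until 'needed' bytes fit."""
        for path in list(self.disk):
            if self.disk_used() + needed <= self.disk_capacity:
                return
            if self.namespace[path].tape_id is not None:
                del self.disk[path]
        if self.disk_used() + needed > self.disk_capacity:
            raise RuntimeError("disk pool full and nothing is safe to evict")
```

A short usage example: with `archive = ToyArchive(disk_capacity=10)`, storing an 8-byte file, migrating it, and then storing a 4-byte file evicts the first file from disk; a later `archive.get(...)` on the first path recalls it from the tape dictionary.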

Tape storage offers a much lower cost per terabyte than hard disks and consumes no electricity when idle, though access times are on the order of minutes rather than seconds.
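The capacity figures quoted for the tape infrastructure can be put in perspective with simple arithmetic. The sketch below estimates how many cartridges a ~100 PB archive would need if it were filled entirely with one cartridge type. That single-type scenario and the use of decimal units (1 PB = 1000 TB) are simplifying assumptions for illustration, not a description of CERN's actual cartridge mix.

```python
# Figures taken from the text: ~100 PB total, 5 TB and 4 TB cartridges.
TOTAL_PB = 100
TB_PER_PB = 1000  # assumption: decimal units

CARTRIDGE_TB = {
    "Oracle StorageTek T10000C": 5,
    "IBM TS1140": 4,
}


def cartridges_needed(total_tb: int, cartridge_tb: int) -> int:
    """Smallest whole number of cartridges that holds total_tb."""
    return (total_tb + cartridge_tb - 1) // cartridge_tb


if __name__ == "__main__":
    total_tb = TOTAL_PB * TB_PER_PB
    for name, size_tb in CARTRIDGE_TB.items():
        print(name, cartridges_needed(total_tb, size_tb))
```

Under these assumptions the archive corresponds to 20,000 cartridges of 5 TB or 25,000 cartridges of 4 TB.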
