  The Scientific Cluster Support Program  
 

Overview


The trend in high-performance computing is towards the use of Linux clusters. Concurrently, there has been growing interest in the use of Linux clusters for scientific research at Berkeley Lab. For many, a cluster assembled from inexpensive commodity off-the-shelf hardware and open source software promises to be a cost-effective way to obtain a high-performance system.

Though many of the concepts are simple, it remains difficult for scientists to navigate a myriad of technologies in order to arrive at a cluster configuration that will meet their needs. Similarly, it is harder to manage a multi-node compute cluster efficiently than it is to manage a desktop workstation. Consequently, early adopters of this technology have had to invest large amounts of effort to realize the full potential of their systems. Findings from the Berkeley Lab Midrange Computing Workshop (March 2002) and subsequent discussions with scientists identified a need for affordable centralized support.

The Scientific Cluster Support program was developed to address the difficulties of obtaining and running a Linux cluster system. Its ultimate goals are to increase the use of scientific computing in Lab research projects, to introduce parallel computing to Berkeley Lab researchers, and to develop efficient, cost-effective methods for managing production clusters.


Program Description

Ten research projects from seven of the Lab's scientific Divisions were selected to participate in the four-year, Laboratory-funded program after a Lab-wide application process that was completed in September 2002. These projects are eligible to receive the following services:

  • Pre-purchase consulting - understanding the customer application; determining the cluster hardware architecture and interconnect; identifying required software
  • Procurement assistance - help with developing a budget and an RFP
  • Setup and configuration - installation and setup of the cluster hardware and networking; installation and configuration of cluster software, scheduler, and application software
  • Ongoing systems administration and cyber security - operating system and cluster software maintenance and upgrades; security updates; monitoring of cluster nodes; user accounts
  • Computer room space with networking and cooling - clusters will be hosted in the Computing Sciences computer room in Building 50B to ensure access to sufficient electrical, cooling, and networking infrastructure

The SCS Steering Committee, a small group composed of selected CSAC members, end users, and technical experts, will aid in the decision making and priority setting of the project.


Requirements

Systems in the SCS Program must meet the following requirements to be eligible for support:

  • IA32 or AMD64 architecture
  • A minimum of 8 compute nodes
  • Dedicated cluster architecture, with no interactive logins on compute nodes
  • Red Hat Linux operating system
  • Warewulf cluster implementation toolkit
  • Sun Grid Engine scheduler
  • All slave nodes reachable only from the master node

Clusters that will be located in the 50B-1275 computer room must meet the following additional requirements:

  • Rack mounted hardware required. Desktop form factor hardware not allowed
  • Equipment to be installed in APC NetShelter VX computer racks. Prospective cluster owners should include the cost of these racks in their budget

Cluster owners should check the SCS Service Level Agreement for a full description of the program provisions and requirements.
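
The requirements above can be read as a checklist. The following is a minimal sketch in Python of how such a checklist might be evaluated for a proposed cluster. It is not an SCS tool: the `ClusterSpec` fields and the `check_eligibility` function are hypothetical names introduced only for illustration, and the rack model requirement is not checked.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterSpec:
    """Hypothetical description of a proposed cluster (illustrative only)."""
    architecture: str                  # e.g. "IA32" or "AMD64"
    compute_nodes: int
    operating_system: str              # e.g. "Red Hat Linux"
    toolkit: str                       # e.g. "Warewulf"
    scheduler: str                     # e.g. "Sun Grid Engine"
    interactive_logins_on_compute_nodes: bool
    nodes_reachable_only_from_master: bool
    hosted_in_50b_1275: bool = False
    rack_mounted: bool = False


def check_eligibility(spec: ClusterSpec) -> List[str]:
    """Return the unmet requirements; an empty list means every check passed."""
    problems = []
    if spec.architecture not in ("IA32", "AMD64"):
        problems.append("architecture must be IA32 or AMD64")
    if spec.compute_nodes < 8:
        problems.append("at least 8 compute nodes are required")
    if spec.interactive_logins_on_compute_nodes:
        problems.append("interactive logins on compute nodes are not allowed")
    if spec.operating_system != "Red Hat Linux":
        problems.append("operating system must be Red Hat Linux")
    if spec.toolkit != "Warewulf":
        problems.append("cluster toolkit must be Warewulf")
    if spec.scheduler != "Sun Grid Engine":
        problems.append("scheduler must be Sun Grid Engine")
    if not spec.nodes_reachable_only_from_master:
        problems.append("slave nodes must be reachable only from the master node")
    # The rack-mount rule applies only to clusters hosted in room 50B-1275.
    if spec.hosted_in_50b_1275 and not spec.rack_mounted:
        problems.append("hardware hosted in 50B-1275 must be rack mounted")
    return problems


if __name__ == "__main__":
    proposal = ClusterSpec("AMD64", 16, "Red Hat Linux", "Warewulf",
                           "Sun Grid Engine", False, True,
                           hosted_in_50b_1275=True, rack_mounted=True)
    for problem in check_eligibility(proposal):
        print("Unmet requirement:", problem)
```

Returning a list of unmet requirements, rather than a single yes or no, lets a prospective cluster owner see everything that needs to change in one pass.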


Schedule

New systems will be phased into the SCS program over the course of the first year. Existing clusters will be added to the program in year 2.

The entire start-to-finish process of specifying, ordering, installing, and configuring a new cluster takes about 2 months. Cluster owners should anticipate this delay and plan accordingly.

Research projects that had planned to purchase their clusters in fy03 were originally scheduled to enter the program first so that their purchase funds could be used in a timely manner. Research projects that have existing clusters will be phased into the program for support in fy04.

The following clusters were placed in the program in fy03:

  • Arup Chakraborty Research Group - Jan 2003 (Completed)
  • Ashok Gadgil and Patricia Brown - March 2003 (Completed)
  • Mike G. Hoversten and Ernest L. Majer - May 2003 (Completed)
  • William H. Miller - May 2003 (Completed)
  • William A. Lester - Aug 2003 (Completed)
  • Michael B. Eisen - Aug 2003 (Completed)

The following clusters were phased into the program in fy04:

  • Steven Brenner, Paul D. Adams, Sung-Hou Kim, Stephen Holbrook - Dec 2003 (Completed)
  • Priscilla Cooper and John Tainer - Dec 2003 (Completed)
  • Martin Head-Gordon - Jun 2004 (Completed)
  • William A. Lester - Cluster Upgrade - Jul 2004 (Completed)

In fy05, these clusters were added to the program:

  • Steven G. Louie, Marvin L. Cohen - Nov 2004 (Completed)
  • Gretina Project - March 2005 (Scheduled)
 
 

 
     

 

Scientific Cluster Support (SCS), Information Technologies & Services Division