High Performance Computing at Berkeley Lab

BERKELEY LABORATORY RESEARCH COMPUTING
Berkeley Lab provides Lawrencium, a 795-node (10,360-core) Linux cluster, to researchers who need access to computational resources to facilitate scientific discovery. The system, which consists of shared nodes and PI-contributed Condo nodes, has a theoretical peak performance of 200 teraflops and access to a 1 PB parallel filesystem. Large-memory, GPU, and Intel Xeon Phi nodes are also available for users to try.
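Users typically reach clusters like Lawrencium through a batch scheduler (HPCS runs SLURM, as noted in the news items below). A minimal job script might look like the following; the partition, account, and module names are illustrative placeholders, not actual Lawrencium settings:

```shell
#!/bin/bash
# Minimal SLURM batch script (illustrative; partition/account names are assumptions)
#SBATCH --job-name=demo
#SBATCH --partition=lr3        # hypothetical partition name
#SBATCH --account=my_project   # replace with your allocation account
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --time=01:00:00

module load openmpi            # assumes an environment-modules setup
srun ./my_simulation input.dat # launch MPI ranks across the allocated nodes
```

Submit with `sbatch job.sh` and monitor the queue with `squeue -u $USER`.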

SCIENTIFIC CLUSTER SUPPORT

HPC Services offers comprehensive Linux cluster support, including pre-purchase consulting, procurement assistance, installation, and ongoing support for PI-owned clusters. Altogether the group manages over 25,000 compute cores. Our HPC User Services consultants can help you get your application running well and make the best use of your new cluster. UC Berkeley PIs can also make use of our services through the new Berkeley Research Computing (BRC) program available through UCB Research IT.

NEWS
Nov 5, 2014 - Data Center Efficiency Summit 2014
Data centers consume approximately 2 percent of the nation’s electrical energy and roughly half of that is consumed by the IT equipment. We partnered with EETD researcher Henry Coles to evaluate server energy use and efficiency among commercially available servers that appear to be similar in design and performance. The results will be presented at the upcoming 2014 Data Center Efficiency Summit held in Santa Clara, CA.

Oct 16, 2014 - HPC Simulations provide a Path to Better Batteries
Batteries with a significant increase in energy density will be needed in the future for automotive and other applications. David Prendergast and Liwen Wan, scientists working in the Theory of Nanostructured Materials group at the Molecular Foundry, a DOE nanoscience research facility hosted by Berkeley Lab, ran a series of computer simulations on their VULCAN Linux Cluster managed by HPCS and at NERSC that dispelled a long-standing misconception about Mg-ions in the electrolyte that transports the ions between a battery’s electrodes. See more...

Sep 23, 2014 - SLURM User Group Meeting
HPCS systems analyst Jackie Scoggins will be giving a talk at the SLURM User Group Meeting this week about how we successfully transitioned our scheduling environment to SchedMD's SLURM job scheduler. She will also be giving a tutorial on using Berkeley Lab NHC (Node Health Check). SLURM is the workload manager on about half of the systems on the Top500 supercomputer list.

Sep 10, 2014 - LabTech 2014 is here

Highlights of the day include our morning mini-classes: three one-hour sessions on getting the most out of Python in scientific computing, a three-hour Arduino basics class, and, new this year, Intro and Advanced LabVIEW. The (free) Lunch and Keynote starts at noon, with an overview of what's new from IT this year. In the afternoon, you'll find over 30 sessions on topics ranging from Globus Online and HPC use cases to video conferencing to Excel. Please go here to register.



Aug 5, 2014 - NHC Talk at Linux Cluster Institute Workshop
The Linux Cluster Institute (LCI) holds an annual workshop to promote best practices for running HPC systems.  This year, Senior HPC engineer Michael Jennings will be leading sessions on the LBNL-developed Warewulf cluster toolkit and Berkeley Lab Node Health Check (NHC) at their 20th annual LCI Workshop held from August 4-8, 2014 at the National Center for Supercomputing Applications (NCSA) in Urbana, Illinois.

May 22, 2014 - BERKELEY RESEARCH COMPUTING Launch Event
The UC Berkeley Research Computing (BRC) program celebrates its launch on Thursday, May 22, 2014 at 3:00 p.m. in Sutardja Dai Hall. This exciting new program will include Condo/Institutional cluster (HPC) computing (in partnership with LBNL HPCS), cloud computing support, and virtual workstations. UC Berkeley researchers needing access to computation should look into this program. To attend the event, RSVP here.

April 15, 2014 - HPCS presenting at GlobusWorld 2014
LBL HPC consultant Krishna Muriki and HPCS systems engineer Karen Fernsler will be presenting at GlobusWorld 2014 on April 15-17, 2014 at Argonne National Laboratory. Their talk "Globus for Big Data and Science Gateways at LBL" will highlight some of our projects, including the Sloan Digital Sky Survey III and the X-Ray Diffraction and Microtomography Beamlines at the Advanced Light Source user facility, which benefit from Globus endpoints to construct data pipelines.

Feb 3, 2014 - NHC Talk at Stanford Exascale Conference
LBL HPCS senior engineer Michael Jennings will be giving a talk on the "Node Health Check (NHC)" on Feb 4, 2014 at the Stanford Conference and Exascale Workshop 2014 sponsored by the HPC Advisory Council. NHC, developed by Jennings, provides the framework and implementation for a highly reliable, flexible, extensible node health check solution. It is now widely recommended by major HPC job scheduler vendors and is in use at many large HPC sites and research institutions.
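NHC is configured as a simple list of host-match/check pairs, and schedulers such as SLURM can invoke it periodically to catch unhealthy nodes. A minimal sketch (the paths, checks, and interval below are illustrative assumptions, not any site's actual configuration):

```shell
# /etc/nhc/nhc.conf -- each line is "<host match> || <check>" (values illustrative)
* || check_fs_mount_rw /tmp            # /tmp must be mounted read-write
* || check_ps_daemon slurmd root       # the slurmd daemon must be running as root

# slurm.conf -- have SLURM run NHC periodically on every compute node
HealthCheckProgram=/usr/sbin/nhc
HealthCheckInterval=300
```

When a check fails, NHC can mark the node offline so the scheduler stops placing jobs there until the problem is fixed.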

Sept 6, 2013 - Monitoring with Ganglia book released
HPC Services cluster expert Bernard Li's new book "Monitoring with Ganglia - Tracking Dynamic Host and Application Metrics at Scale" was recently published by O'Reilly Media. His book shows how to use Ganglia to collect and visualize metrics from clusters, grids, and cloud infrastructures at any scale.

Mar 21, 2013 - Supporting Science with HPC
HPC Services manager Gary Jung gave a presentation on "Supporting Science with HPC" at the "Enabling Discovery and Production Innovation with Dell HPC Solutions" workshop held in Santa Clara, where he discussed how Berkeley Lab researchers from EETD, ESD, NSD, and Physics are using HPC data pipelines to accomplish their science.

Mar 18, 2013 - GPU Accelerated Synchrotron Radiation Calculation
Today, HPC Services consultant Yong Qin will be presenting his work during a poster session at this week's GPU Technology Conference, GTC 2013, in San Jose, California. Yong's work demonstrates how data parallelism can be applied to spectrum calculation of undulator radiation, which is widely used at synchrotron light facilities across the world. More

Mar 13, 2013 - The Science of Clouds computed on Lawrencium
The climate models that scientists use to understand and project climate change are improving constantly; however, the largest source of uncertainty in today's climate models is clouds. As the source of rain and wind, clouds are important in modeling climate. Berkeley Lab scientist David Romps discusses his work to develop better cloud-resolving models. More

Nov 13, 2012 - Node Health Check
HPCS staffers Jackie Scoggins and Michael Jennings gave a well-attended presentation on Jennings's Node Health Check (NHC) software today at the Adaptive Computing booth at SC12. NHC works in conjunction with the job scheduler and resource manager to ensure clean job runs on large HPC systems.

Nov 12, 2012 - Warewulf wins Intel Cluster Ready "Explorer" award
The Berkeley Lab Warewulf Cluster Toolkit development team has been honored with the 'Explorer Award' from the Intel(R) Cluster Ready team at Intel, which recognizes organizations that have continued to explore and implement Intel Cluster Ready (ICR) certified systems. The award was presented to lead developer Greg Kurtzer and co-developers Michael Jennings and Bernard Li of the IT Division's HPC Services Group at the annual Intel Partners meeting.

Oct 24, 2012 - HPCS at 2012 Data Center Efficiency Summit
HPCS staff member Yong Qin will be part of a panel, along with other Berkeley Lab scientists, at the 2012 Data Center Efficiency Summit today in San Jose talking about Berkeley Lab's recently released study to understand the feasibility of implementing Demand Response and control strategies in Data Centers. Yong will discuss the issues and our experiences related to reducing or geographically shifting computational workload to a remote datacenter as a response to a demand to lower electrical usage.

Sept 18, 2012 - Warewulf featured in HPC Admin Magazine
This month's issue of HPC Admin Magazine features the last in a 4-part series on how to best use the latest version of the Warewulf Cluster Toolkit. Warewulf, developed by LBNL's Greg Kurtzer and recently certified as Intel Cluster Ready, is a zero-cost, open source solution that guarantees integration and compatibility with Intel products as well as 3rd-party hardware and software solutions. Read this article to learn how to use it.

May 21, 2012 - Cloud Bursting for Particle Tracking
ALS physicists Changchun Sun and Hiroshi Nishimura, along with HPCS staff Kai Song, Susan James, Krishna Muriki, Gary Jung, Bernard Li, and Yong Qin, recently explored the use of Amazon's VPC service to transparently extend the ALS compute cluster and software environment into the public cloud to provide on-demand compute resources for particle tracking and NGLS APEX development. Their work was presented during the poster session at the International Particle Accelerator Conference (IPAC12) in New Orleans this week.

Jan 24, 2012 - Bootstrapping Institutional Capability
HPC Services Manager Gary Jung talks about the issues institutions may encounter when developing new infrastructure, or enhancing existing infrastructure, to support data-intensive science at the Winter 2012 ESCC/Internet2 Joint Techs Conference in Baton Rouge, LA this week.


Nov 3, 2011 - Supercomputers Accelerate Development of Advanced Materials

Researchers from Berkeley Lab and MIT have teamed up to develop a new tool, as part of the Materials Project, to speed up the development of new materials. The project incorporates the use of supercomputing resources, including Lawrencium, to characterize the properties of inorganic compounds. More

Oct 25, 2011 - Supercomputing As A Service
LBL CIO Rosio Alvarez and HPC Services Manager Gary Jung present their experiences using cloud services for HPC at InformationWeek's GovCloud 2011 conference in Washington DC. More


PROJECTS

Big Data at the ALS
We built a data pipeline for PI David Shapiro using a fast 400 MB/s CCD, a 43,002-core GPU cluster, and a 15 TB data transfer node with Globus Online to perform 3D X-ray diffraction image reconstruction at Beamline 5.3.2.1. Read more here about how the project set a microscopy record by achieving the highest resolution ever.
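Globus endpoints are what stitch pipelines like this together. With the modern standalone Globus CLI (the 2014-era hosted CLI used different syntax), moving a dataset between two endpoints looks roughly like the following; the endpoint variables and paths are placeholders, not the project's actual endpoints:

```shell
# Illustrative only: SRC_EP and DST_EP hold the UUIDs of two activated Globus endpoints
globus transfer --recursive --label "ALS Beamline 5.3.2.1 frames" \
  "$SRC_EP:/data/raw_frames" \
  "$DST_EP:/project/reconstruction"
```

Globus manages the transfer asynchronously, retrying failed files and verifying checksums, which is what makes it suitable for beamline-scale data volumes.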

Water-Cooled Processors!
Together with researchers in EETD, we evaluated the use of hot-water, direct-to-chip liquid cooling on an 800-core Intel IvyBridge cluster. Does it work well? Ask us.

Chemical Science Division
One of our current projects is a Dell 720-core Intel IvyBridge processor cluster for DOE Early Career Award winner Dan Haxton of the Atomic, Molecular, and Optical Theory Group.

Material Science Division
We recently completed a dedicated 1824-core Intel SandyBridge processor cluster, Catamount, for MSD researchers. The system will be used to investigate the development of advanced materials.

Joint Center for Artificial Photosynthesis (JCAP)
We recently completed a dedicated 2944-core AMD Interlagos cluster for the researchers at JCAP, which is the nation's largest research program dedicated to the development of artificial solar-fuel generation technology.

Geologic Carbon Sequestration
A 64-node, 768-core Condo expansion has been added to Lawrencium for PIs Quanlin Zhou and Curt Oldenburg in ESD. It will be used for modeling of geologic carbon sequestration, a major measure for climate change mitigation and an important component of LBNL's Carbon Cycle 2.0 Initiative.