Research Services deployed two new data storage services in 2013. The first, launched in February, is targeted at high-performance and high-capacity users. It is a paid service, and users bought more than 500TB of storage in the first six months. The second, launched in October, provides 3TB of standard network drive space to users with more moderate capacity and performance requirements.
Serving both researchers and scholars
In 2012, ITS-Research Services focused on operationalizing high-performance computing (HPC), piloting data storage platforms, and building a strong support staff. RS continued to pursue its mission to serve both researchers and scholars by gaining visibility, participating in campus humanities efforts, and building partnerships.
The team continues to expand as ITS is able to fund the services RS provides for the campus. Three full-time positions were added to support HPC, cyberinfrastructure, and Galaxy, a web-based platform for data-intensive biomedical research. Overall, HPC consumed about 75% of RS efforts.
In 2013, RS focused on deployment of two new data storage services and on building the next-generation HPC cluster, Neon. Contributions to the campus conversation about cyberinfrastructure and informatics were also a focus.
Plans for the year ahead
For 2014, the focus will be on improving support and awareness of existing services, improving the research computing community, and making the Neon HPC cluster a productive tool for the campus.
Building on the success of its first high-performance computing (HPC) cluster, the University of Iowa has now built a second “supercomputer” that became available to researchers in late 2013.
The UI’s first HPC cluster, Helium, came online in 2011 and, with rapidly growing use by UI researchers, is now operating near capacity, at more than 90 percent utilization. In September 2013, the state Board of Regents approved the purchase of the new system, Neon. The addition of Neon boosted the university’s HPC capacity from 3,700 to 6,132 processor cores.
From years to days
“The decision to invest in Neon is based on the tremendous success we've experienced with Helium,” says P. Barry Butler, executive vice president and provost. “By matching central resources with faculty-generated grants we've been able to build a shared, high-performance computing environment that's available to all faculty, whether they have grant funding or not. The efficiency of this approach to shared investment is something I think the university should be very proud of.”
HPC supports research that would not otherwise be possible at the UI, in some cases condensing the time required to run simulations from years to days by using multiple processors that work on single or multiple computational tasks at the same time. UI investigators use Helium for a wide variety of projects, like modeling flood scenarios to help Iowa communities make decisions about mitigation, understanding how uterine cancer develops, and modeling climate change and airflow in the lungs.
Twenty-five research groups from seven UI colleges have already invested $700,000 in the new system. The university centrally funded the remaining $500,000 of Neon’s total $1.2 million cost. The system has a safe, secure home at the UI’s new Information Technology facility, and is administered and maintained by the ITS Research Services team.
History and rapid growth of HPC at the UI
HPC got its start at the UI two years ago, when the Institute for Clinical and Translational Science, IIHR-Hydroscience and Engineering, and ITS-Research Services teamed up to launch Helium.
The initial investment that made Helium possible came from a faculty member in the College of Engineering, and investments from a dozen other researchers followed, providing the startup funds. A centrally funded expansion of the system in fall 2011 opened it up to users across the institution.
Users from all across campus
Use of Helium increased dramatically after its first year. By the end of 2012, the system had provided 2,200 compute years of service, up from just 250 in its first year. The number of users soared from about 50 in 2011 to over 200 in 2012 and then to 450 by the end of 2013, and the user base now represents over 60 units from all across campus.
Today Helium consists of 3,700 processor cores and over 500 terabytes of data storage. It provides over 2 million compute hours per month to UI researchers. At current market rates, utilizing external HPC resources such as Amazon’s compute service would cost the UI over $300,000 per month.
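As a rough illustration of the cost comparison above, the two quoted figures imply an external price per compute hour. The sketch below is a back-of-the-envelope calculation, not the method ITS used to arrive at its estimate. Both inputs appear in the text as lower bounds ("over"), the names are illustrative, and treating one compute hour as one core-hour is an assumption.

```python
# Figures quoted in the text; both are stated as "over", so treat them as floors.
COMPUTE_HOURS_PER_MONTH = 2_000_000   # compute hours Helium provides per month
EXTERNAL_COST_PER_MONTH = 300_000     # USD per month for equivalent external HPC


def implied_price_per_hour(monthly_cost, monthly_hours):
    """Return the external price per compute hour implied by the two totals."""
    return monthly_cost / monthly_hours


def annual_equivalent(monthly_cost):
    """Scale a monthly cost to twelve months."""
    return monthly_cost * 12


price = implied_price_per_hour(EXTERNAL_COST_PER_MONTH, COMPUTE_HOURS_PER_MONTH)
annual = annual_equivalent(EXTERNAL_COST_PER_MONTH)

print(f"Implied external price: ${price:.2f} per compute hour")
print(f"Annual equivalent: ${annual:,} per year")
```

Under these assumptions the quoted totals work out to about 15 cents per compute hour.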
The installation of Neon, built using the latest generation of compute, storage, and network hardware, will help meet the increased demand for HPC capabilities on campus by providing an additional 2,432 processor cores, 1,440 co-processor cores, 7,488 Graphics Processing Unit (GPU) cores, access to large amounts of memory, and 216 terabytes of storage capacity.
Predicting flood impact for the state of Iowa
More than 1,600 communities across the state are at risk for flooding, and researchers at the UI-based Iowa Flood Center are using high-performance computing (HPC) to improve flood monitoring and prediction. With complex mathematical algorithms, the HPC cluster produces projections every 15 minutes of what river levels in each town will be for the next five days.
Severe flooding of the Iowa and Cedar Rivers in 2008 was the catalyst for the center, established with support from the National Science Foundation and the state legislature. A portion of the funding was invested in Helium. A website provides access to the Iowa Flood Information System, which features inundation maps, real-time flood information, forecasts, and interactive visualizations.
Prediction models mimic the aggregation of water, taking into account rainfall, water levels, soil type, information about land use and vegetation, and where crops are in the growth cycle. Different models also factor in unknowns that influence flooding—for example, researchers do not have data on saturation, which means it’s hard to say how much of the rain will infiltrate to the rivers.
Contributing to the challenge of predicting river levels are the state’s size and the need to produce timely forecasts. Iowa consists of over 56,000 square miles of land and 3 million water pathways, and new rainfall data comes through every five minutes. HPC is necessary to make calculations for such an immense area at a pace fast enough to keep up with rapidly changing circumstances.
“There is no operational system in the world as detailed as ours,” says Flood Center Director Witold Krajewski, a professor of civil and environmental engineering at the UI. “It’s fair to say that if not for the HPC cluster, we wouldn’t be doing the work we are doing.”
Modeling climate change
One UI researcher using HPC is Pablo Saide, who is pursuing a Ph.D. in environmental engineering. He is from Santiago, Chile, where air pollution is a major concern. The surrounding Andes Mountains and seasonal weather conditions cause vehicle and industrial emissions to linger, and the local government uses predictions to declare air pollution episodes. If an episode is imminent, measures are taken to reduce smog and people are encouraged to stay indoors. But often it’s too little too late.
“Forecasts are conducted 24 hours in advance, but by waiting until an episode is imminent, the efforts to prevent the episode don’t do much good,” he says. “Hospitals are full of people with asthma and kids. And people develop cancer and long-term health problems because of the pollution.”
With HPC, Saide is developing computer model simulations that can forecast air pollution three days in advance. Data for the simulations comes from measuring stations around the city that monitor how the plume moves, horizontally and vertically. The air-quality measurements, along with meteorological and forecasting data, feed into models that Helium can process quickly, enabling scientists to generate high-resolution simulations that predict pollution.
Understanding Huntington disease
HPC is also being used by UI neuroscientists as they study changes in the brains of Huntington disease (HD) patients before they begin to experience symptoms. The research could lead to earlier interventions for people diagnosed with the hereditary disorder, which causes widespread brain tissue atrophy, interfering with mobility, memory, speech, and mood.
The PREDICT-HD study involves 1,500 research subjects worldwide. Researchers analyze brain scans from the patients over a 10-year period and apply algorithms to extract measurements that quantify the progression of the disease. Measurements include changes in brain volume, tissue composition, structural size, anatomical regions, and cortical depth. Researchers look at how changes in different regions of the brain correlate with psychiatric, behavioral, and cognitive measures.
Testing each algorithm’s effectiveness takes more than 42 hours of computation per imaging scan session. There are 4,400 data sets to test with each method, and many parameters to modify.
“Testing the algorithms on a single computer would take two or three years of data processing,” says Hans Johnson, Ph.D., an assistant professor of psychiatry. “Helium allows us to do that in one day.”
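To give a sense of the scale behind these figures, the short sketch below multiplies out the numbers quoted above. It is an illustration only and does not attempt to reproduce the researchers' own estimates. It assumes that each of the 4,400 data sets corresponds to one scan session, that the 42 hours is single-core time, and that the work divides evenly across cores. The 8-core workstation is a hypothetical example.

```python
HOURS_PER_SESSION = 42          # quoted as "more than 42 hours" per scan session
DATA_SETS = 4_400               # data sets to test with each method
HOURS_PER_YEAR = 365.25 * 24    # 8,766 hours in an average year


def total_cpu_hours(hours_per_item, items):
    """Total computation for one method across every data set."""
    return hours_per_item * items


def ideal_wall_clock_years(cpu_hours, cores):
    """Elapsed years if the work splits perfectly across `cores` cores."""
    return cpu_hours / cores / HOURS_PER_YEAR


cpu_hours = total_cpu_hours(HOURS_PER_SESSION, DATA_SETS)

WORKSTATION_CORES = 8           # hypothetical single computer, not from the text
years = ideal_wall_clock_years(cpu_hours, WORKSTATION_CORES)

print(f"Total computation per method: {cpu_hours:,} CPU hours")
print(f"Ideal time on {WORKSTATION_CORES} cores: {years:.1f} years")
```

Under these assumptions, one method alone requires at least 184,800 CPU hours, which a single multi-core machine would need years to work through.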
For more stories on HPC research and details on UI HPC resources, visit http://hpc.uiowa.edu.
Deployed Data Storage Services
RS continues to build collaborations all across campus. A partnership with the Iowa Institute for Human Genetics led to the deployment of Galaxy, a web-based platform for data-intensive biomedical research that enables non-bioinformaticians to create, run, tune, and share their analyses.
Collaboration with EPSCoR, the Experimental Program to Stimulate Competitive Research, led to the hiring of a new employee to provide cyberinfrastructure support. He works with colleagues at Iowa State University and in ITS to support researchers working on renewable energy initiatives.
RS also participated in Digital Studio for Public Humanities meetings, began engaging in a campus GIS initiative, and formed a Research Computing group of about a dozen members from various areas of the UI that collaborates on maintenance of Helium and other research computing projects. In addition, RS spearheaded an eResearch advisory group that provides feedback and guidance on cyberinfrastructure (CI) initiatives and developed recommendations for improved CI support.
Helium funding and ITS investments allowed for more training in HPC and related areas, such as an introduction to computing on Helium and working with InfiniBand, a high-speed networking technology. RS is also drawing on national curricula such as those provided by XSEDE, the National Science Foundation national network of high-end cyberinfrastructure providers, and the EPSCoR Cyberinfrastructure program.
Enhancements to Helium in 2012 included improved storage performance, data transfer bandwidth, and stability. RS also increased the amount of publicly available scratch storage and deployed additional software packages to support the broader user base. In addition, staff members improved documentation and system statistics collection and reporting, deployed GridFTP, and registered Helium with a national data transfer service, Globus Online.
HPC Users by College
The user base for the Helium HPC cluster represents more than 60 units from all across campus. The College of Engineering and the Carver College of Medicine represent the highest percentage of total users, followed by the College of Liberal Arts and Sciences.
|Category||Business||Engineering||Graduate College||Liberal Arts & Sciences||Medicine||Pharmacy||Public Health||VPR||Other|
|Percent of Total Users||5||31||4||21||27||1||3||4||4|
|Percent of Total System Time||0||57||1||4||22||0||6||10||0|
Years CPU Time/Quarter
One measure of HPC use is the number of years of Central Processing Unit (CPU) time used. (This is how long the calculations would take to perform if a single-CPU system were used; for example, a workload requiring 100 years of CPU time would take about 50 years on a dual-core desktop system.) ITS has observed significant growth in the CPU time used by UI researchers since the implementation of HPC. During the first quarter of 2011, about 61 years of CPU time were used; during the final quarter of 2013, about 678 years were used.
|Category||Q1 2011||Q2 2011||Q3 2011||Q4 2011||Q1 2012||Q2 2012||Q3 2012||Q4 2012||Q1 2013||Q2 2013||Q3 2013||Q4 2013|
|Years of CPU Time||61||215||197||315||382||475||561||783||723||718||783||678|
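The CPU-time measure described above can be expressed as a short calculation. The sketch below is a minimal illustration using the quarterly figures from the table. The function and variable names are illustrative, and the conversion assumes work that divides perfectly across cores, which real workloads only approximate.

```python
# Quarterly years of CPU time, copied from the table above.
QUARTERS = ["Q1 2011", "Q2 2011", "Q3 2011", "Q4 2011",
            "Q1 2012", "Q2 2012", "Q3 2012", "Q4 2012",
            "Q1 2013", "Q2 2013", "Q3 2013", "Q4 2013"]
CPU_YEARS = [61, 215, 197, 315, 382, 475, 561, 783, 723, 718, 783, 678]


def wall_clock_years(cpu_years, cores):
    """Ideal elapsed time for `cpu_years` of work spread evenly over `cores`."""
    return cpu_years / cores


def growth_factor(values):
    """Ratio of the last value in a series to the first."""
    return values[-1] / values[0]


# The example from the text: 100 CPU years on a dual-core desktop.
desktop_years = wall_clock_years(100, 2)

usage = dict(zip(QUARTERS, CPU_YEARS))
growth = growth_factor(CPU_YEARS)

print(f"100 CPU years on 2 cores: {desktop_years:.0f} years")
print(f"Q1 2011: {usage['Q1 2011']} CPU years, Q4 2013: {usage['Q4 2013']} CPU years")
print(f"Growth from first to last quarter: {growth:.1f}x")
```

By this measure, quarterly usage grew roughly elevenfold between the first quarter of 2011 and the final quarter of 2013.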
Additional High-Performance Computing Metrics
2011
- ~50 users
- 1600 Processor Cores
- 220TB Storage
- Largest Memory Node - 24GB
- 8 Racks
- External Bandwidth - 4Gb
- 250 Years of Compute Time
2012
- Over 200 users
- 3900 Processor Cores
- 500TB Storage
- Largest Memory Node - 144GB
- 16 Racks
- External Bandwidth - 10Gb
- 2200 Years of Compute Time
2013
- 450 users
- 6300 Processor Cores
- 1400TB Storage
- Largest Memory Node - 512GB
- 23 Racks
- External Bandwidth - 10Gb
- 5890 Years of Compute Time