GeoSciCloud: Running Data Centers in Cloud Computing Environments
In partnership with UNAVCO, IRIS was recently awarded an EarthCube Building Block grant to investigate the use of cloud computing environments for data centers the size of IRIS and UNAVCO. We will compare three environments: the current computing environment at IRIS, the cloud capabilities of NSF’s Extreme Science and Engineering Discovery Environment (XSEDE), and the commercial environment offered by Amazon Web Services (AWS). While it is often stated that leveraging cloud services yields cost savings, many factors could affect actual costs for centers such as UNAVCO and IRIS.
Specifically, IRIS will do the following in each of the three environments (IRIS’ current computing environment, XSEDE, and AWS):
- Deploy two years of Transportable Array data
- Deploy the entire GSN data set from 1988 to current
- Deploy the fdsn-dataselect web service
- Deploy the IRIS time-series service which performs digital signal processing
- Carry out several other performance monitoring and detailed operational tasks
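As a rough illustration of how the same dataselect request might be issued against each environment, the sketch below builds identical query URLs for three base hosts. It assumes the standard FDSN web service path layout (`/fdsnws/dataselect/1/query`) and its `net`, `sta`, `loc`, `cha`, `start`, and `end` parameters. The XSEDE and AWS host names are hypothetical placeholders, and the station and time window are chosen only as an example.

```python
from urllib.parse import urlencode

# Base hosts per environment. Only the IRIS host is a real public service;
# the XSEDE and AWS entries are hypothetical placeholders for illustration.
ENVIRONMENTS = {
    "iris": "https://service.iris.edu",
    "xsede": "https://xsede.example.org",
    "aws": "https://aws.example.org",
}


def dataselect_url(base, net, sta, loc, cha, start, end):
    """Build a dataselect query URL for one channel and time window."""
    params = {
        "net": net,
        "sta": sta,
        "loc": loc,
        "cha": cha,
        "start": start,
        "end": end,
    }
    return f"{base}/fdsnws/dataselect/1/query?{urlencode(params)}"


def urls_for_all_environments(**query):
    """Return the same query expressed against every environment."""
    return {
        name: dataselect_url(base, **query)
        for name, base in ENVIRONMENTS.items()
    }


# Example request: one hour of data from a GSN station (illustrative values).
urls = urls_for_all_environments(
    net="IU", sta="ANMO", loc="00", cha="BHZ",
    start="2015-01-01T00:00:00", end="2015-01-01T01:00:00",
)
for name, url in urls.items():
    print(name, url)
```

Because the query string is identical in every case, any differences in response time or throughput could be attributed to the environment rather than to the request itself.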
We will also gain an understanding of ongoing maintenance, run performance tests, and test elasticity in the cloud systems with actual users. Ultimately, we will estimate the cost to operate in each of the environments. In addition, we will study reception of real-time data in the cloud, investigate and support the HDF5 format for seismological data, and build a system that allows seamless access to IRIS and UNAVCO data for some of our primary data types.
After the testing phases are complete, we will offer resources in the environment deemed the best cloud environment to a broader cross-section of data centers from across the geosciences, which may include data sets from the long tail of science.
by Tim Ahern (IRIS Data Management Center)