A Common Cloud Platform, building the next generation system
Common Cloud Platform project
The Common Cloud Platform (CCP) is a joint project for IRIS Data Services (DS) and UNAVCO Geodetic Data Services (GDS) to design and implement a cloud-based platform to handle the needs of both facilities by 2023. Beyond the migration of repositories and support systems, the project aims to combine these systems whenever possible. Major anticipated advantages of CCP include:
- integrated discovery and access for all geophysical data holdings
- scalable capacity that adjusts easily to meet demand
- hosting data in a system that offers HPC-style capability to users, removing the need to transfer data across the internet
- accommodating new data types more easily
- cost-effective operation, paying only for resources needed
The project affords an opportunity to retire technical debt while building in low-level modernization such as FAIR data capabilities, identity management, and the use of existing open-source software whenever possible. It is also an opportunity for staff members of the two facilities to develop common practices and jointly build a next-generation platform.
The project is currently in the planning and evaluation stages. The interests of the research community are represented by the Chairs of the IRIS DS and UNAVCO GDS governance committees, who serve as project board members.
What does transition mean for users?
We anticipate offering all of our key web services and many of the same, or enhanced, applications. This means that most current users will have little or no adjustment to make. With more capacity we expect to offer the same services while supporting higher usage and more parallelism. If and when adjustments are needed by researchers, we will endeavor to make the changes as minimal and easy as possible.
The most exciting prospect of the project is the potential to enable major new services and opportunities. One-stop shopping for seismological, geodetic and other geophysical data will help users with the initial challenges of integrating these different data types. Hosting data within, or adjacent to, a system with significant computational capability will allow users to perform large-volume processing without needing to transfer the data across the internet.
There are still many details to work out, with many more questions than answers at this point. But we are excited to be on this path and will keep our research community informed of the progress.
Background and motivation
For many decades the Data Management Center (DMC) has done data collection, processing and archive management on hardware and software systems directly maintained by DMC staff. While this has given us flexibility and some degree of cost control, it has also imposed some limitations. Over the last 3 years the GeoSciCloud project allowed us to explore the realities of operating in a cloud-based system. We learned a number of important lessons. Most importantly, we learned that operating a data center like the DMC in the cloud is possible and affords significant potential enhancements. We also learned that it is complicated in different ways and that controlling costs requires new approaches.
Following a joint review of the IRIS and UNAVCO data service facilities in 2019, NSF directed the two groups to form a single project to prototype a common, cloud-based system for our combined data management needs. In addition to this motivation, we would like to leverage the capabilities offered by cloud systems to provide enhanced data access that would otherwise be quite difficult or impossible to support on our self-managed systems. We expect that such advanced data access and processing needs will become increasingly important in the future. Finally, we want the anticipated merger with UNAVCO to result in a streamlined data management system that offers broad data discovery and rich access. Altogether, the time to start evolving our data management system is now.
by Chad Trabant (IRIS DMC) and Jerry Carter (IRIS)