Running GEOS-Chem on AWS cloud – it is easy!
This project is supported by the AWS Public Data Set Program and the NASA Atmospheric Composition Modeling and Analysis Program (ACMAP).
How to request support: For any questions, bug reports, or functionality requests, please post your issue on the GitHub issue tracker. All you need is a free GitHub account. Using the issue tracker is the preferred approach because all discussions are public and can be easily found by anyone with similar problems.
How to use this documentation
For GEOS-Chem users, this website contains everything you need in order to use GEOS-Chem on the cloud. You will be able to finish a complete research workflow, from model simulations to output data analysis and management. If it is your first time trying GEOS-Chem, this project is perhaps your best starting point, because you don’t need to do any initial setup and the model is guaranteed to work correctly (see quick start guide). For more details about the GEOS-Chem model itself, please refer to our comprehensive user guide and wiki.
For non-GEOS-Chem users, this documentation can serve as an introduction to AWS for scientific computing, especially for Earth science model simulations. Since all Earth science models are highly similar from a software perspective, it should be quite easy to adapt this guide to your specific use case. More than 90% of this website covers general AWS concepts and tutorials, which don’t require GEOS-Chem-specific knowledge. Get a feel for the cloud computing workflow by exploring the beginner tutorials, and then refer to the developer guide to build your own model. Although cloud computing has a lot of potential in Earth science, it is still significantly under-utilized due to the lack of accessible tutorials for Earth science researchers. This project tries to fill this gap.
For general reference, GEOS-Chem is a Chemical Transport Model for simulating atmospheric chemical composition. It has been developed for over 20 years and is used by more than 100 research groups worldwide. The program is mainly written in Fortran 90. All model source code is distributed freely under the MIT license. Input and output data are mostly in NetCDF format, which can be analyzed easily in most languages, such as Python, R and MATLAB. IDL (Interactive Data Language) has historically been the major data analysis tool, but we now embrace open-source tools, especially Python, Jupyter and xarray. The classic version of GEOS-Chem uses OpenMP parallelization (shared-memory, multi-threading). There is also an MPI version of GEOS-Chem, which we are now testing in a thousand-core cloud-HPC environment.
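To give a flavor of the Python and xarray analysis mentioned above, here is a minimal sketch. It assumes `numpy` and `xarray` are installed, and it builds a small synthetic dataset in memory instead of reading real model output. The variable name `O3`, the dimension names, and the grid sizes are illustrative assumptions, not the actual GEOS-Chem output layout.

```python
import numpy as np
import xarray as xr


def make_demo_dataset():
    """Build a small synthetic stand-in for a NetCDF output file.

    The variable name "O3" and the (lev, lat, lon) layout are
    hypothetical; real output files may be organized differently.
    """
    lat = np.linspace(-90.0, 90.0, 4)
    lon = np.linspace(-180.0, 180.0, 5, endpoint=False)
    lev = np.arange(3)
    # Constant mixing ratio of 40e-9 mol/mol everywhere (made-up value).
    data = np.ones((lev.size, lat.size, lon.size)) * 40e-9
    return xr.Dataset(
        {"O3": (("lev", "lat", "lon"), data, {"units": "mol mol-1"})},
        coords={"lev": lev, "lat": lat, "lon": lon},
    )


# With a real file, one would typically call xr.open_dataset(<path>) instead.
ds = make_demo_dataset()

# Select the first vertical level and take a simple (unweighted) mean.
surface_o3 = ds["O3"].isel(lev=0)
mean_ppb = float(surface_o3.mean()) * 1e9

print(f"Mean surface O3: {mean_ppb:.1f} ppb")
```

The mean here is not area-weighted; a real analysis on a latitude-longitude grid would usually weight by grid-cell area.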
Table of Contents
- Beginner tutorials
- Quick start guide for new users
- Overview of basic AWS compute and storage services
- Set up AWS Command Line Interface (AWS-CLI)
- Use S3 as major storage
- Use EBS volumes as temporary disk storage
- Use Spot Instances to reduce EC2 cost
- Notes on security groups (EC2 firewall)
- Put everything together: a complete workflow
- Advanced tutorials
- Developer guide
- AWS concepts and services in detail