You are invited to our upcoming NASA Transform to Open Science (TOPS) Community Forum on September 14 at 1 p.m. ET for a webinar on eclipse events and their connection to the world of open science. We will discuss how eclipse research can contribute to collective research efforts, leading to innovative findings and a deeper comprehension of these phenomena.
We are excited to have two distinguished speakers share their expertise and insights: Mitzi Adams, assistant manager of the Heliophysics and Planetary Science Branch at NASA’s Marshall Space Flight Center, and Dr. Kelly Korreck, program manager for the 2023 and 2024 Solar Eclipses and program scientist for the Heliophysics Division in the Science Mission Directorate at NASA HQ.
Adams will explore the data-driven side of eclipses and how open science principles facilitate the sharing and accessibility of data, which can lead to innovative findings and a deeper comprehension of eclipse events.
In addition, Dr. Korreck will discuss eclipse events and how they relate to the world of open science, with an introduction to NASA’s Heliophysics Big Year. This is a global celebration of science and the Sun’s influence on Earth and the entire solar system.
Register now to secure your spot. This event is open to the public and will include an interactive Q&A following the presentations.
Xarray is an open-source Python package that makes working with complex, multi-dimensional arrays elegant, intuitive, and efficient. Real-world datasets, such as those generated by NASA, are often a collection of many related variables on a common grid. These datasets are more than just arrays of values: they have labels that describe how array values map to locations in dimensions such as space and time, and metadata that describes how the data was collected and processed. Xarray embraces the complexity of real-world datasets and enables users to use metadata such as dimension names and coordinate labels to easily analyze, manipulate, and visualize their datasets. Xarray makes data analysis more intuitive and enjoyable, while keeping the metadata that describes how data was collected and processed attached to the data itself.
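As a rough illustration of what labeled dimensions and coordinates look like in practice, here is a minimal sketch. The temperature values, grid, and attribute names are made up for this example and do not come from any NASA dataset.

```python
import numpy as np
import xarray as xr

# Synthetic data (an assumption for illustration): three daily
# temperature readings on a 2 x 2 latitude/longitude grid.
temps = np.array([
    [[10.0, 11.0], [12.0, 13.0]],
    [[14.0, 15.0], [16.0, 17.0]],
    [[18.0, 19.0], [20.0, 21.0]],
])

da = xr.DataArray(
    temps,
    dims=("time", "lat", "lon"),  # dimension names
    coords={                       # coordinate labels for each dimension
        "time": np.array(
            ["2023-01-01", "2023-01-02", "2023-01-03"],
            dtype="datetime64[ns]",
        ),
        "lat": [10.0, 20.0],
        "lon": [100.0, 110.0],
    },
    name="temperature",
    # Free-form metadata travels with the array.
    attrs={"units": "degC", "source": "synthetic values for illustration"},
)

# Select by coordinate label rather than by integer position.
point = da.sel(lat=20.0, lon=100.0)

# Reduce over a named dimension instead of remembering an axis number.
time_mean = da.mean(dim="time")

print(point.values)
print(time_mean.dims)
```

Here `point` is the time series at a single grid cell and `time_mean` is a two-dimensional array over `lat` and `lon`. Neither step requires knowing the order of the axes in the underlying NumPy array.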
A Vital Role in Handling NASA’s Evolving Data Demands
Consider for a moment that NASA’s Science Mission Directorate (SMD) collectively stores over 100 Petabytes (PB) of data and estimates doubling that to 200 PB per year within the next five years. Handling data at this scale is clearly an important consideration as the volume of data from modern sensors continues to grow. Xarray’s flexibility has played a pivotal role in NASA’s transition to cloud computing infrastructure, ensuring efficient and robust data processing for the agency’s vast repositories of information. Xarray is a common component in workflows involving NASA datasets across many domains, including physical oceanography and glaciology.
In 2021, NASA selected Xarray as one of eight open-source projects for funding under the Open Source Tools, Frameworks, and Libraries program. This financial support has not only allowed the Xarray project to flourish, but also to broaden its use with NASA data through maintenance and outreach activities (see the full proposal announcement).
Committed support from NASA has been instrumental in allowing Xarray maintainers to make major progress on long-term goals such as reorganizing the code base for long-term sustainability, substantially revamping the Xarray tutorial website, and implementing new features that benefit a wide range of domains. NASA’s ongoing support has also allowed maintainers to spend time on day-to-day maintenance tasks and handle user support requests more quickly. Previously, such work was performed on a volunteer basis and was hard to sustain.
Xarray has used the funds to help build its community through the SIParCS summer internship program at NCAR (blog), participate in conferences such as SciPy 2023, and host virtual office hours. Over the next year, the team plans to get more involved with domain-specific extensions that serve the needs of NASA’s remote sensing data through rioxarray, continue the office hours program, and represent Xarray at a number of additional conferences.
Xarray at SciPy 2023
Thanks to NASA funding, Xarray was able to participate in SciPy 2023 in a significant way. The tutorial at SciPy 2023 was an exciting opportunity for scientists already familiar with Xarray to delve into advanced topics. The 2023 tutorial targeted intermediate-advanced level material and built on the fundamental level tutorial delivered at SciPy 2022.
Tutorial participants reported they were able to streamline their workflows by using more of Xarray’s built-in functions after gaining insight into concepts that were initially intimidating, such as parallelizing computations on very large datasets.
The team had good turnout at the SciPy “sprints”, where the Xarray community worked together with allied projects like Zarr to discuss and quickly solve problems. Emma Marshall presented a great talk, building on her 2022 SIParCS internship work with Xarray, on how to organize tidy remote sensing datasets in a manner that facilitates easy analysis in the future.
An Open and Inclusive Community
In addition to impressive technical capabilities, one of Xarray’s greatest strengths is its vibrant and inclusive community. Xarray has been publicly developed on GitHub since 2014, with over 270 contributors improving the project through open development practices. Thanks to these active GitHub contributions, conference tutorials, and virtual office hours, Xarray has garnered interest from over 10,000 active users across various scientific disciplines.
NASA funding from the OSTFL program supports Xarray maintainers from historically underrepresented groups in the fields of Earth Science and open-source software development, demonstrating a commitment to inclusivity. Tutorials and virtual office hours increase the visibility of these individuals so they can serve as role models within their communities. Xarray places high value on a diverse group of users and contributors at all levels of software development expertise in order to improve the overall quality and accessibility of the software.
If you are interested in contributing your skills and enthusiasm to the Xarray project by reporting bugs, improving documentation, suggesting enhancements, or sharing any other ideas, visit the contributions page today.
Xarray’s commitment to these principles of openness and inclusivity is in close alignment with NASA’s vision of open science. At NASA, 2023 is the year of open science, and one of the core ideas of open science is that by breaking down barriers and having scientists from diverse backgrounds engage with research, scientific discoveries will be accelerated. To aid in growing the open science community, NASA is developing a curriculum to train scientists, researchers, and citizen scientists to use open science tools, like Xarray, in their research. To learn more and to pre-enroll in the curriculum, visit the Transform to Open Science (TOPS) GitHub page.
With its powerful capabilities and inclusive community, Xarray is a compelling tool that will hopefully entice readers to explore its potential in their data science endeavors at NASA and beyond. Head over to their GitHub to explore the wealth of resources and the vibrant community that has made this project what it is today.
The June TOPS Monthly Community Forum will be integrated into the bi-annual Community Panel, held from June 14 to June 16 in a hybrid format and broadcast live from NASA HQ in Washington, D.C.
The TOPS Community Panel reviews and provides input on NASA’s strategy for transitioning to open-source science. The panel meeting will bring together leaders from the open science, open source software, and data science communities with the NASA TOPS team for a detailed review of TOPS plans. The meeting will be public and have tools for the public to submit questions.
We want to hear from you! During the panel, we’ll dedicate an entire hour to answering your questions live on air. We invite you to register to attend virtually and submit your questions during the event, using our IO tool.
Where: Public (virtual) meeting
When: 14-16 June 2023, 12-4 EST (9-11 PST), each day. (Full Agenda)
Questions? Submit questions before and during the panel using our IO tool here
Discussion topics? Start a discussion on GitHub