NSF Unidata
nonprofit, Boulder, United States
Research output, citation impact, and the most-cited recent papers from NSF Unidata. Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from NSF Unidata
Abstract Artificial intelligence (AI) and machine learning (ML) pose a challenge for achieving science that is both reproducible and replicable. The challenge is compounded in supervised models that depend on manually labeled training data, as they introduce additional decision‐making and processes that require thorough documentation and reporting. We address these limitations by providing an approach to hand labeling training data for supervised ML that integrates quantitative content analysis (QCA), a method from social science research. The QCA approach provides a rigorous and well‐documented hand labeling procedure to improve the replicability and reproducibility of supervised ML applications in Earth systems science (ESS), as well as the ability to evaluate them. Specifically, the approach requires (a) the articulation and documentation of the exact decision‐making process used for assigning hand labels in a “codebook” and (b) an empirical evaluation of the “reliability” of the hand labelers. In this paper, we outline the contributions of QCA to the field, along with an overview of the general approach. We then provide a case study to further demonstrate how this framework has been and can be applied when developing supervised ML models for applications in ESS. With this approach, we provide an actionable path forward for addressing ethical considerations and goals outlined by recent AGU work on ML ethics in ESS.
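As a rough illustration of what an empirical evaluation of labeler reliability can look like, the sketch below computes Cohen's kappa for two hand labelers. This is an assumption for illustration only: the abstract does not say which reliability statistic the paper uses, and the function name and example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two labelers who labeled the same items.

    Illustrative only: the source abstract does not name a specific
    reliability statistic, so kappa is an assumed stand-in.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each category, the product of the two
    # labelers' marginal proportions, summed over categories.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both labelers used a single identical category for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels from two labelers following the same codebook.
labeler_a = ["storm", "clear", "storm", "clear"]
labeler_b = ["storm", "clear", "clear", "clear"]
kappa = cohens_kappa(labeler_a, labeler_b)
```

Kappa corrects raw agreement for agreement expected by chance, which is why it is often reported alongside a codebook rather than a simple percentage match.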
Presentation on the early history of ACDD given on 20 January 2026 during the "Future of ACDD" session at the ESIP January 2026 Meeting.
What's Changed
Remove fake author by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/540
update contributing guide by @jukent in https://github.com/ProjectPythia/pythia-foundations/pull/539
Todo typo by @bl-freeman in https://github.com/ProjectPythia/pythia-foundations/pull/542
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/544
Replace dead links in GitHub Foundations chapter by @bl-freeman in https://github.com/ProjectPythia/pythia-foundations/pull/545
Update citations by @bl-freeman in https://github.com/ProjectPythia/pythia-foundations/pull/548
Fix foundations links by removing '.html' by @jukent in https://github.com/ProjectPythia/pythia-foundations/pull/550
Add JOSE status badges by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/551
Fix remaining links by @kafitzgerald in https://github.com/ProjectPythia/pythia-foundations/pull/552
add analytics for foundations by @jukent in https://github.com/ProjectPythia/pythia-foundations/pull/554
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/555
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/556
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/557
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/558
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/559
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/560
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/561
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/562
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/563
Add content links in matplotlib.md by @kafitzgerald in https://github.com/ProjectPythia/pythia-foundations/pull/581
Add Brittany Freeman to author list by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/584
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/585
Address build warnings and fix links by @kafitzgerald in https://github.com/ProjectPythia/pythia-foundations/pull/564
Add Brittany and Katelyn to the CITATION.cff file by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/586
Add additional citations by @kafitzgerald in https://github.com/ProjectPythia/pythia-foundations/pull/565
Add central glossary, additional abbreviations, links, and minor editorial changes by @kafitzgerald in https://github.com/ProjectPythia/pythia-foundations/pull/582
Add more links by @r-ford in https://github.com/ProjectPythia/pythia-foundations/pull/589
Add citation to FAIR-4-Research-Software paper by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/588
No node in conda environment by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/594
Add min version for mystmd by @kafitzgerald in https://github.com/ProjectPythia/pythia-foundations/pull/596
Fix: DOI badge by @agoose77 in https://github.com/ProjectPythia/pythia-foundations/pull/592
Update Numpy Basics notebook by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/600
Add outline to NumPy overview page by @r-ford in https://github.com/ProjectPythia/pythia-foundations/pull/602
Replace old HTML admonitions with proper MyST syntax throughout the book by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/601
Fix links in NumPy overview by @r-ford in https://github.com/ProjectPythia/pythia-foundations/pull/603
Fix Zenodo badge on landing page by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/598
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/610
Fix broken links by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/609
Improved authorship credits by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/612
Update Pandas notebook for Pandas 3.0 by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/607
Update Overview section by @r-ford in https://github.com/ProjectPythia/pythia-foundations/pull/613
Add author Lily Kailyn with ORCID and affiliation by @brian-rose in https://github.com/ProjectPythia/pythia-foundations/pull/615
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/616
Update timezone content by @dcamron in https://github.com/ProjectPythia/pythia-foundations/pull/620
Address nightly build failure by @jukent in https://github.com/ProjectPythia/pythia-foundations/pull/618
[pre-commit.ci] pre-commit autoupdate by @pre-commit-ci[bot] in https://github.com/ProjectPythia/pythia-foundations/pull/619
New Contributors
@bl-freeman made their first contribution in https://github.com/ProjectPythia/pythia-foundations/pull/542
Full Changelog: https://github.com/ProjectPythia/pythia-foundations/compare/v2025.06.25...v2026.03.30
This poster describes Project Pythia's annual community hackathons, aka Cook-offs. These summer sprints blur the lines between scientist and software developer at an individual and group level, and seed excitement and commitment to the open source, open science community.
Project Pythia is the educational arm of the Pangeo community, and provides a growing collection of community driven and developed training resources that help geoscientists navigate the Pangeo ecosystem, and the myriad complex technologies essential for today’s Big Data science challenges. Project Pythia began in 2020 with the support of a U.S. NSF EarthCube award. Much of the initial effort focused on Pythia Foundations: a collection of Jupyter Notebooks that covered essential topics such as Python language basics; managing projects with GitHub; authoring and using “binderized” Jupyter Notebooks; and many of Pangeo’s core packages such as Xarray, Pandas, and Matplotlib. Building upon Foundations, the Pythia community turned its attention toward creating Pythia Cookbooks: exemplar collections of recipes for transforming raw ingredients (publicly available, cloud-hosted data) into scientifically useful results. Built from Jupyter Notebooks, Cookbooks are explicitly tied to reproducible computational environments and supported by a rich infrastructure enabling collaborative authoring and automated health-checking – essential tools in the struggle against the widespread notebook obsolescence problem. Open-access, cloud-based Cookbooks are a democratizing force for growing the capacity of current and future geoscientists to practice open science within the rapidly evolving open science ecosystem. In this talk we outline our vision of a sustainable, inclusive open geoscience community enabled by Cookbooks. With further support from the NSF, the Pythia community will accelerate the development and broad buy-in of these resources, demonstrating highly scalable versions of common analysis workflows on high-value datasets across the geosciences. Infrastructure will be deployed for performant data-proximate Cookbook authoring, testing, and use, on both commercial and public cloud platforms. 
Content and community will expand through annual workshops, outreach, and classroom use, with recruitment targeting under-served communities. Priorities will be guided by an independent steering board; sustainability will be achieved by nurturing a vibrant, inclusive community backed by automation that lowers barriers to participation.
Poster presented on 16 December 2025 at the AGU Annual Meeting to the "Evolving Science Commons - Poster" session. Abstract: The CF (Climate and Forecast) Conventions are a community-developed standard for describing Earth system science data in the netCDF data format (and more recently in Zarr/GeoZarr). The CF Conventions can encode information that describes the coordinate systems, data structure, and geophysical meaning and units of each variable, and how the data were collected. The conventions are widely used by weather and climate scientists and remote-sensing researchers and are gaining traction in new communities, such as biogeochemistry and operational weather prediction. They have a mature ecosystem of FOSS (Free and Open-Source Software) and commercial software tools that can explore, analyze, and visualize data that are encoded using the CF Conventions. This presentation will provide a high-level overview of CF, including governance and the CF data model, and review discussions and outcomes from the 2025 CF Workshop held virtually 22-25 September 2025. These discussions included how CF should evolve in terms of representing discovery metadata and provenance, vocabulary management, support for localized metadata, and use in machine learning models.
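To make the kind of metadata described above concrete, here is a minimal sketch of CF-style attributes held in plain Python dictionaries. It is not taken from the poster: the variable names, the dictionary layout, and the helper function are hypothetical, and only the attribute names (`Conventions`, `standard_name`, `units`, `calendar`, `cell_methods`) come from the CF Conventions themselves.

```python
# Hypothetical in-memory description of a small CF-style dataset.
# The layout (nested dicts) is an illustration, not a netCDF or Zarr API.
dataset = {
    "global_attributes": {"Conventions": "CF-1.11"},
    "variables": {
        "time": {
            "dimensions": ("time",),
            "attributes": {
                "standard_name": "time",
                "units": "hours since 2025-01-01 00:00:00",
                "calendar": "standard",
            },
        },
        "lat": {
            "dimensions": ("lat",),
            "attributes": {"standard_name": "latitude", "units": "degrees_north"},
        },
        "lon": {
            "dimensions": ("lon",),
            "attributes": {"standard_name": "longitude", "units": "degrees_east"},
        },
        "tas": {
            "dimensions": ("time", "lat", "lon"),
            "attributes": {
                # Geophysical meaning and units of the data variable.
                "standard_name": "air_temperature",
                "units": "K",
                # How the values were derived along the time axis.
                "cell_methods": "time: mean",
            },
        },
    },
}


def variables_missing_attribute(ds, attribute):
    """Return sorted names of variables that lack the given attribute.

    A toy check for illustration; it is not a CF compliance checker.
    """
    return sorted(
        name
        for name, variable in ds["variables"].items()
        if attribute not in variable["attributes"]
    )


without_units = variables_missing_attribute(dataset, "units")
```

In practice these attributes would live in a netCDF or Zarr store and be read by CF-aware tools; the dictionary form is used here only to keep the sketch free of third-party dependencies.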