Status: Active Development
This project is in the data engineering phase. We aim to release the first version of the standardized dataset in late 2026.

The Mission: “AI-Ready” Hurricane Data

For decades, NOAA’s Hurricane Research Division (HRD) and the National Hurricane Center (NHC) have collected invaluable inner-core observations from aircraft reconnaissance. However, these observations are spread across incompatible formats, legacy archives, and shifting standards, which makes them difficult to use in modern machine learning (ML) and data assimilation applications.

This project aims to unify these disparate sources into a single, standardized, and “AI-Ready” dataset.

By converting decades of raw flight-level and dropwindsonde data into modern, self-describing formats (NetCDF/HDF5), we are building the foundational infrastructure required to train the next generation of deep learning models for tropical meteorology.

The Challenge

  • Disparate Sources: Data is scattered across multiple agencies (NOAA, U.S. Air Force Reserve) and archival systems.
  • Legacy Formats: Much of the historical data exists in ASCII text files or proprietary binary formats that require specialized decoding.
  • Inconsistent Metadata: Variable names, units, and quality control flags have changed repeatedly over the last 30 years.
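
As a rough illustration of the metadata problem, the sketch below maps legacy variable names and units onto one target schema. The legacy names (`WS`, `WSPD`, `TA`), the lookup table, and the `standardize` helper are hypothetical placeholders, not the project's actual schema; only the CF standard names `wind_speed` and `air_temperature` and their canonical units come from the CF conventions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

KNOTS_TO_MS = 1852.0 / 3600.0  # one knot is exactly 1852 m per hour


@dataclass(frozen=True)
class TargetVar:
    name: str           # unified variable name (hypothetical schema)
    units: str          # unified units string
    standard_name: str  # CF standard name


# Hypothetical lookup: (legacy name, legacy units) -> (target, converter).
# A real table would be built from the archive documentation for each era.
LEGACY_MAP: Dict[Tuple[str, str], Tuple[TargetVar, Callable[[float], float]]] = {
    ("WS", "kt"): (
        TargetVar("wind_speed", "m s-1", "wind_speed"),
        lambda v: v * KNOTS_TO_MS,
    ),
    ("WSPD", "m/s"): (
        TargetVar("wind_speed", "m s-1", "wind_speed"),
        lambda v: v,
    ),
    ("TA", "degC"): (
        TargetVar("air_temperature", "K", "air_temperature"),
        lambda v: v + 273.15,
    ),
}


def standardize(legacy_name: str, legacy_units: str, value: float) -> Tuple[TargetVar, float]:
    """Return the unified variable description and the converted value."""
    key = (legacy_name, legacy_units)
    if key not in LEGACY_MAP:
        # Fail loudly rather than guess at an unknown name/unit pair.
        raise KeyError(f"No mapping for {legacy_name!r} in {legacy_units!r}")
    target, convert = LEGACY_MAP[key]
    return target, convert(value)


target, value = standardize("WS", "kt", 100.0)
print(target.name, target.units, round(value, 2))
```

Keying the table on the name and the units together means the same legacy name can map differently when its units changed between archive eras.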

Methodology & Goals

We are building a robust processing pipeline to standardize this data:

  1. Ingestion: Aggregating high-density observations (HDOBS), dropwindsondes, and tail-Doppler radar data.
  2. Standardization: Mapping all physical quantities to a unified schema with consistent units and CF-compliant metadata.
  3. Quality Control: Implementing automated QC flags to identify instrument errors and outliers.
  4. Distribution: Outputting the final product in cloud-optimized NetCDF/HDF5 formats accessible via Python/Xarray.
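
To make the quality-control step more concrete, here is a minimal sketch of automated flagging. It combines a physical range check with a robust outlier test based on the median absolute deviation (MAD). The flag codes, the valid range, and the threshold are assumptions chosen for illustration, not the pipeline's actual QC scheme.

```python
import math
from statistics import median

# Hypothetical flag codes for this sketch.
GOOD, SUSPECT, BAD, MISSING = 0, 1, 2, 9


def qc_flags(values, valid_min, valid_max, mad_threshold=5.0):
    """Assign one QC flag per value.

    MISSING: value is None or NaN
    BAD:     value lies outside [valid_min, valid_max]
    SUSPECT: value is in range but far from the median (robust z-score)
    GOOD:    everything else
    """
    def is_missing(v):
        return v is None or math.isnan(v)

    # Statistics are computed only from present, physically plausible values.
    usable = [v for v in values if not is_missing(v) and valid_min <= v <= valid_max]
    if usable:
        med = median(usable)
        mad = median([abs(v - med) for v in usable])
    else:
        med, mad = 0.0, 0.0

    flags = []
    for v in values:
        if is_missing(v):
            flags.append(MISSING)
        elif not (valid_min <= v <= valid_max):
            flags.append(BAD)
        elif mad > 0 and abs(v - med) / (1.4826 * mad) > mad_threshold:
            # 1.4826 scales the MAD to match a standard deviation
            # for normally distributed data.
            flags.append(SUSPECT)
        else:
            flags.append(GOOD)
    return flags


# Example: wind speeds in m/s with a spike, a fill value, and a gap.
winds = [40.0, 41.0, 39.5, 40.5, 95.0, -999.0, float("nan"), 40.2]
print(qc_flags(winds, valid_min=0.0, valid_max=120.0))
```

A global test like this would flag genuine eyewall gradients as outliers, so a real implementation would more plausibly apply it over a sliding window. Under the CF conventions, flags of this kind are typically stored in an ancillary variable that carries `flag_values` and `flag_meanings` attributes.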

Future Access

  • Code Repository: Scripts for decoding and standardizing raw flight data will be released on GitHub.
  • Data Access: The curated dataset will be hosted on NOAA/NCEI repositories with public access.

For inquiries regarding collaboration or early access to data samples, please contact me.

