Radars are vital in meteorology: by detecting severe weather early, they enable timely warnings that save lives and reduce property damage. Beyond forecasting, radar data supports applications such as statistical analysis and climatology, which rely on time-series analysis. A radar volume scan, comprising data collected through multiple cone-like sweeps at various elevation angles, often exceeds several megabytes, is produced every 5 to 10 minutes, and is usually stored as an individual file. Consequently, national weather radar networks accumulate vast collections of disconnected files spanning several decades. This presents significant challenges in data storage and availability, particularly when treating radar data as a time-series dataset, and parallels the complexities of managing big data.
Traditionally, radar data storage involves proprietary formats that demand extensive input-output (I/O) operations, leading to prolonged computation times and high hardware requirements. In response, our study introduces a novel data model designed to address these challenges. Leveraging the FM301 hierarchical tree structure, which is based on the Climate and Forecast (CF) Conventions and endorsed by the World Meteorological Organization (WMO), together with Analysis-Ready Cloud-Optimized (ARCO; Abernathey et al. 2023) formats, we developed an open data model to arrange, manage, and store radar data efficiently in cloud-storage buckets. This approach uses a suite of Python libraries, including Xarray (Xarray-Datatree), Xradar, Wradlib, and Zarr, to implement a hierarchical tree-like data model. This model is designed to align with the new open data paradigm, emphasizing the FAIR principles (Findable, Accessible, Interoperable, Reusable).
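As a rough illustration of the hierarchical idea described above, the sketch below models a volume scan as a root node with one child group per sweep, then reads one sweep across several consecutive scans as a time series. It uses only plain Python dictionaries, not the project's actual implementation or the Xarray/Zarr APIs. The group names (`sweep_0`, ...), the attribute names (`time_coverage_start`, `fixed_angle`), the `DBZH` variable, the elevation angles, and the helper functions are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical elevation angles (degrees) for one volume scan.
ELEVATIONS = [0.5, 1.5, 2.4]


def make_volume(scan_time):
    """Build one volume scan as nested groups: a root node plus one child per sweep."""
    sweeps = {
        f"sweep_{i}": {"fixed_angle": angle, "variables": ["DBZH"]}
        for i, angle in enumerate(ELEVATIONS)
    }
    return {"time_coverage_start": scan_time.isoformat(), "groups": sweeps}


def group_paths(volume):
    """List the group paths of a volume, root first, then one path per sweep."""
    return ["/"] + [f"/{name}" for name in volume["groups"]]


def sweep_time_series(volumes, sweep_name):
    """Collect (time, fixed_angle) for one sweep across consecutive volume scans."""
    return [
        (v["time_coverage_start"], v["groups"][sweep_name]["fixed_angle"])
        for v in volumes
    ]


start = datetime(2024, 1, 1, 0, 0)
# Three illustrative scans, five minutes apart.
volumes = [make_volume(start + timedelta(minutes=5 * k)) for k in range(3)]

print(group_paths(volumes[0]))
print(sweep_time_series(volumes, "sweep_0"))
```

With separate files, the last step would require opening every file in turn. Keeping the sweeps under a common tree is what makes this kind of access along the time axis straightforward.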
Alfonso Ladino-Rincon, Max Grover
> [!WARNING]
> This project is under active development. Features may change frequently, and some parts of the library may be incomplete. Please proceed with caution.
If you are interested in running this material locally on your computer, follow this workflow:

1. Clone the repository:

       git clone https://github.com/aladinor/raw2zarr.git

2. Move into the `raw2zarr` directory:

       cd raw2zarr

3. Create and activate the conda environment from the `environment.yml` file:

       conda env create -f environment.yml
       conda activate raw2zarr

4. Move into the `notebooks` directory and start up JupyterLab:

       cd notebooks/
       jupyter lab
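Before opening the notebooks, it can be useful to confirm that the activated environment exposes the libraries named in the project description. The snippet below is a minimal sketch using only the Python standard library. It assumes that `environment.yml` installs these packages under the import names `xarray`, `xradar`, `wradlib`, and `zarr`, which has not been verified against the file. The `check_packages` helper is hypothetical.

```python
import importlib.util

# Import names of the libraries mentioned in the project description (assumed).
REQUIRED = ["xarray", "xradar", "wradlib", "zarr"]


def check_packages(names):
    """Map each package name to True if it can be found in the active environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_packages(REQUIRED)
for name, found in status.items():
    print(f"{name}: {'ok' if found else 'missing'}")
```

If any entry reports `missing`, the environment is probably not activated or was not created successfully.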