Published November 6, 2025 | Version v1
Dataset
Open
Data for manuscript "Analysis of convective cell development with split and merge events using a graph-based methodology" by Ritvanen et al.
Creators
- 1. Finnish Meteorological Institute
Description
# Data for manuscript "Analysis of convective cell development with split and merge events using a graph-based methodology" by Ritvanen et al.
This repository contains data for the manuscript "Analysis of convective cell development with split and merge events using a graph-based methodology" by Ritvanen et al. submitted to Atmospheric Measurement Techniques.
The following data is provided:
- `cell_graph_database_v20251105.tar`: a PostgreSQL database dump file containing the cell track data
- `cell_subgraph_data_v20251105.tar`: a tar file containing subgraph data for all split and merge events. The file contains daily parquet files for each day in the study period.
- `numerical_figure_data_v20251105.tar`: a tar file containing data files with numerical data used to create the figures in the manuscript.
- `README_data.md`: this file describing the data files.
## Description of database file
The database dump was created with the command
```bash
pg_dump --no-owner --no-privileges -C --format=t --blobs --verbose --user <username> --file "<dump_file>" <database_name>
```
Restoring the database can be done with the commands
```bash
createdb -O <username> <database_name>
pg_restore -vxOW --role=<username> -U <username> -d <database_name> <dump_file>
```
Note that this assumes that PostgreSQL with PostGIS is installed and that a PostgreSQL user with the given username exists and has permission to create databases.
## Description of subgraph data files
The subgraph data tar file contains daily parquet files for each day in the study period. Each parquet file contains subgraph data for all split and merge events that occurred on that day. The columns in the parquet files are:
- `method`: cell tracking method used (only 'opencv_vil_1.0:minArea_10:clusters_0' in this data)
- `type`: type of event ('split', 'merged', 'split-merge')
- `identifier`: identifier of the cell, unique in combination with `timestamp` and `method`
- `t0_node`: identifier of the t0 / event node of the subgraph
- `timestamp`: timestamp of the cell
- `level`: level of the cell from the event node (0 = event node, -1 = one timestep before event, 1 = one timestep after event, etc.)
- `area`: area of the cell in km^2
- `event`: string descriptor of cell tracking status
- `num_cells_at_level`: number of cells at the given level in the subgraph
- `t0_time`: timestamp of the event node
For examples of how to read and use the subgraph data, see the Jupyter notebook `notebooks/plot_article_figures.ipynb` in the code provided with the manuscript.
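As a rough sketch of how the columns listed above fit together, the following Python snippet sums the cell area at each `level` of every event subgraph. It assumes pandas with a parquet engine such as pyarrow is installed. The function names, the choice of grouping keys, and the synthetic example rows are illustrative assumptions and are not taken from the dataset or the manuscript code.

```python
import pandas as pd


def load_day(path):
    """Read one daily parquet file (requires a parquet engine such as pyarrow)."""
    return pd.read_parquet(path)


def total_area_by_level(df, event_type=None):
    """Sum cell area (km^2) at each level of every event subgraph.

    A subgraph is identified here by (method, type, t0_node, t0_time).
    This grouping is an assumption based on the column descriptions.
    """
    if event_type is not None:
        df = df[df["type"] == event_type]
    keys = ["method", "type", "t0_node", "t0_time", "level"]
    return (
        df.groupby(keys, as_index=False)["area"]
        .sum()
        .rename(columns={"area": "total_area"})
    )


# Synthetic rows (not real data): two cells merge into one at level 0.
example = pd.DataFrame(
    {
        "method": ["m"] * 3,
        "type": ["merged"] * 3,
        "identifier": [1, 2, 3],
        "t0_node": [3] * 3,
        "timestamp": pd.to_datetime(
            ["2021-06-01 12:00", "2021-06-01 12:00", "2021-06-01 12:05"]
        ),
        "level": [-1, -1, 0],
        "area": [20.0, 30.0, 55.0],
        "t0_time": pd.to_datetime(["2021-06-01 12:05"] * 3),
    }
)
areas = total_area_by_level(example, event_type="merged")
```

With real data, `load_day` would be pointed at one of the daily parquet files extracted from `cell_subgraph_data_v20251105.tar`, and its result passed to `total_area_by_level`.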
## Description of files containing numerical version of figures
- Figure 7:
- `fig7a_split_merge_num_cells_histogram.csv`: fraction of events as a function of number of cells participating in splits and merges. Columns: `num_cells` (number of cells), `type` (split or merged), `fraction_of_events` (fraction of events with given number of cells).
- `fig7b_split_merge_num_cells_histogram.csv`: fraction of events as a function of number of cells participating in splits and merges in merge-split events. First row contains number of merging cells and first column number of splitting cells. Values are fraction of events with given number of merging and splitting cells.
- Figure 8:
- `fig8_split_merge_cell_area_histograms.json`: JSON file containing x and y values for the cell area histograms of split, merge, and merge-split events and of the cells involved in split and merge events. The structure is:
```json
{
  "split": {
    "area": {
      "<group_label>": {
        "x": [],
        "y": []
      },
      ...
    },
    ...
  },
  ...
}
```
The first-level keys are 'split', 'merged', 'split-merge', 'splitted', and 'merging', corresponding to the different event and cell types. The only second-level key is 'area' (cell area). The third-level keys are the group labels used in the histograms, and each group label holds the x and y values of its histogram. Note that the x and y values are the points of the line used to draw the histogram, not the bar heights.
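The three-level nesting described above can be walked with a small generator. This is a minimal sketch using only the Python standard library; the helper names and the synthetic example dictionary are hypothetical and only mirror the documented key layout.

```python
import json


def load_histograms(path):
    """Load a histogram JSON file into a nested dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def iter_histograms(data):
    """Yield (event_type, variable, group_label, x, y) for every histogram."""
    for event_type, variables in data.items():
        for variable, groups in variables.items():
            for group_label, points in groups.items():
                yield event_type, variable, group_label, points["x"], points["y"]


# Synthetic stand-in with the documented layout (values are not real data).
example = {
    "split": {
        "area": {
            "group A": {"x": [0, 10, 10, 20], "y": [0.0, 0.5, 0.5, 0.0]},
        }
    }
}
rows = list(iter_histograms(example))
```

Each yielded `x`, `y` pair can be passed directly to a line plot, since the values are line points rather than bar heights.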
- Figure 9:
- `fig8_split_merge_area_ratio_histograms.json`: JSON file containing x and y values for area ratio histograms for split and merged cells. The structure is:
```json
{
  "split": {
    "<area_interval>": {
      "x": [],
      "y": []
    },
    ...
  },
  "merged": {
    "<area_interval>": {
      "x": [],
      "y": []
    },
    ...
  }
}
```
where `<area_interval>` is a string representation of the area interval used in the histogram (e.g., "(0, 1000]").
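To use the area intervals numerically, the key strings have to be parsed. The sketch below assumes every key follows the same bracket notation as the example "(0, 1000]"; the function name is hypothetical, and keys in the actual file may need additional handling.

```python
import re

# Matches labels such as "(0, 1000]" or "[10, 20)".
_INTERVAL_RE = re.compile(
    r"^\s*([\[\(])\s*([^,]+?)\s*,\s*([^\]\)]+?)\s*([\]\)])\s*$"
)


def parse_interval(label):
    """Return (low, high, closed_left, closed_right) for an interval label."""
    match = _INTERVAL_RE.match(label)
    if match is None:
        raise ValueError(f"not an interval label: {label!r}")
    left_bracket, low, high, right_bracket = match.groups()
    return float(low), float(high), left_bracket == "[", right_bracket == "]"


bounds = parse_interval("(0, 1000]")
```

Sorting the keys by the parsed lower bound gives the intervals in increasing order of area, which plain string sorting would not guarantee.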
- Figure 10:
- `fig10_trajectory_development_split_merge_vil_thr_20.json`: JSON file containing x and y values for trajectory development plots for split and merged cells with maximum VIL threshold of 20 dBZ. The structure is:
```json
{
  "<title>": {
    "x": [],
    ...
    "..._ci": {
      "bottom_x": [],
      ...
    },
    ...
  },
  ...
}
```
where the `<title>` keys are the title of the variable plotted (e.g., "Total cell Area") and the title of the subplot.
- `fig10_trajectory_development_split_merge_all_available_between_min_-3_max_0_max_vil_thr_20.json`: same as above but for all available cells between timesteps -3 and 0 with maximum VIL threshold of 20 dBZ.
- `fig10_trajectory_development_split_merge_all_available_between_min_-3_max_6_max_vil_thr_20.json`: same as above but for all available cells between timesteps -3 and 6 with maximum VIL threshold of 20 dBZ.
- Figure C1:
- `figC1_trajectory_development_split_merge.json`: same structure as Figure 10 but without any VIL threshold applied.
- `figC1_trajectory_development_split_merge_all_available_between_min_-3_max_0.json`: same as above but for all available cells between timesteps -3 and 0.
- `figC1_trajectory_development_split_merge_all_available_between_min_-3_max_6.json`: same as above but for all available cells between timesteps -3 and 6.
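The matrix layout described for `fig7b_split_merge_num_cells_histogram.csv` (first row with the number of merging cells, first column with the number of splitting cells) can be turned into a lookup table. The following is a sketch under stated assumptions: the file is comma-delimited, the top-left corner cell carries no data, and the counts are integers. The function name and the inline example values are hypothetical.

```python
import csv
import io


def read_count_matrix(lines):
    """Map (n_splitting, n_merging) to the fraction of events.

    `lines` is any iterable of CSV lines, e.g. an open file object.
    """
    reader = csv.reader(lines)
    header = next(reader)
    # Skip the corner cell; the rest of the first row are merging-cell counts.
    merging_counts = [int(value) for value in header[1:]]
    fractions = {}
    for row in reader:
        if not row:
            continue
        n_splitting = int(row[0])
        for n_merging, cell in zip(merging_counts, row[1:]):
            if cell != "":  # leave out empty cells
                fractions[(n_splitting, n_merging)] = float(cell)
    return fractions


# Synthetic stand-in for the file contents (values are not real data).
example_csv = io.StringIO(",2,3\n2,0.5,0.25\n3,0.125,\n")
fractions = read_count_matrix(example_csv)
```

For the real file, an object returned by `open(path, newline="")` can be passed in place of the `io.StringIO` stand-in.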
Files (2.4 GB)
| Checksum | PID | Size |
|---|---|---|
| md5:a338de5164ecc9d22478faa4f6b530de | http://hdl.handle.net/11304/59a587d4-4f24-4ef5-a617-8125d4a3e1fe | 2.4 GB |
| md5:203d4ef1a25e250a131af1265fe15826 | http://hdl.handle.net/11304/703310af-35fb-431e-a09c-1c19790480c0 | 6.0 MB |
| md5:04aacc7bfa8a0214d7dcafa2ba187425 | http://hdl.handle.net/11304/932fec26-d75a-4b2e-9d2f-23088a328b84 | 2.0 MB |
| md5:959425c66460e368d370e85de3fd53cb | http://hdl.handle.net/11304/bb651cc9-b6e2-4d1f-8a8d-cecaae5c5d29 | 6.8 kB |
| md5:c0c27f3a1e0fd9b3ab50bf6158c1dd8f | http://hdl.handle.net/11304/52bbfdfb-cef3-471b-97e4-37afb8124aad | 6.3 kB |
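A downloaded file can be checked against its listed `md5:` checksum before use. The snippet below is a minimal sketch with the Python standard library; the function names are hypothetical, and the demonstration uses a small temporary file rather than any file from this dataset.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file with a checksum string of the form 'md5:<hex digest>'."""
    algorithm, _, expected = listed.partition(":")
    if algorithm != "md5":
        raise ValueError(f"unsupported checksum type: {algorithm!r}")
    return md5_of_file(path) == expected.strip().lower()


# Demonstration on a temporary file (not a file from the dataset).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")
    with open(demo_path, "wb") as f:
        f.write(b"hello")
    demo_listed = "md5:" + hashlib.md5(b"hello").hexdigest()
    demo_ok = matches_checksum(demo_path, demo_listed)
    demo_bad = matches_checksum(demo_path, "md5:" + "0" * 32)
```

Reading in chunks keeps memory use low, which matters for the 2.4 GB file.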
Additional details
Identifiers
- URL
- https://etsin.fairdata.fi/dataset/c65e6372-44e6-4091-97c3-dbe01a6e492e
- b2rec
- ac2197da4a034d21bee1fd9cb75ecfaf
Funding
- Vilho, Yrjö and Kalle Väisälä Fund of the Finnish Academy of Science and Letters
- Research Council of Finland project PINCAST (grant no. 341964)
- Swiss Science Foundation project no. CRSII5 201792 J.K.
FMI metadata
- Link to external data location (URL)
- https://doi.org/10.5281/zenodo.17540363
- Parameter
  - Parameter name: rain rate
    - Parameter unit: mm/h
    - Parameter description: rain rate / rainfall intensity measured by weather radar
  - Parameter name: vertically integrated liquid
    - Parameter unit: kg/m^2
    - Parameter description: radar-derived liquid water amount in a vertical column
  - Parameter name: Zdr column height
    - Parameter unit: m
    - Parameter description: Height of 1dB Zdr contour above environmental zero level
- Data levels (meter, hectoPascal, degree, sigma pressure levels, other) in vertical direction(+/-) for example 1500 m or 850 hPa
  - Level: surface
- Topic category
- climatologyMeteorologyAtmosphere
Temporal Coverage
Ranges:
Start date: 2021-04-30
End date: 2023-09-29