AtmoSeer

About

This project provides a pipeline to build rainfall forecast models using 1D Convolutional Neural Networks. The pipeline can be configured with different meteorological data sources.

Install

In the root directory of this repository, run the following command (conda must be installed on your system):

./setup.sh
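If you prefer to set the environment up manually, the usual conda workflow would look roughly like the sketch below. Note that the environment name and the presence of an environment.yml file are assumptions; the project's actual steps live in ./setup.sh.

```shell
# Hypothetical manual setup; check ./setup.sh for the project's actual steps.
conda env create -f environment.yml   # assumes an environment.yml in the repo root
conda activate atmoseer               # environment name is an assumption
```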

Project pipeline

The project pipeline is defined as a sequence of three steps: (1) data import, (2) data pre-processing and (3) model training. These steps are implemented as Python scripts in the ./src directory.

Data import scripts

All datasets generated by the scripts described in this section are stored in the ./data folder.

Script import_ws_cor.py

This script has four command line arguments:

  • -s or --sta, which defines which station to import. Provide the weather station of interest by name: alto_da_boa_vista, guaratiba, iraja, jardim_botanico, riocentro, santa_cruz, sao_cristovao, vidigal.
  • -a or --all, which, if set to 1, imports data from all stations.
  • -b or --begin and -e or --end, which define the interval of years to import (default: 1997 to 2022).

Example 1:

python import_ws_cor.py -s sao_cristovao

The above command imports the São Cristóvão station dataset into the project data folder.

Example 2:

python import_ws_cor.py -a 1 -b 2000 -e 2015

The above command imports all the stations in the period from 2000 to 2015.

Script import_inmet.py

This script has four command line arguments:

  • -s or --sta, which defines which station to import. Provide the station by its code: A652 (Forte de Copacabana), A636 (Jacarepagua), A621 (Vila Militar), A602 (Marambaia).
  • -a or --all, which, if set to 1, imports data from all stations.
  • -b or --begin and -e or --end, which define the interval of years to import (default: 1997 to 2022).
  • -t or --api_token, which defines the INMET API token used to access the data.

Example 1:

python import_inmet.py -s A652 -api_token <token_string>

The above command imports the observations from the station with code A652, saving the downloaded content to a file named 'A652_1997_2022.csv'.

Example 2:

python import_inmet.py -a 1 -b 1999 -e 2017 -api_token <token_string>

The above command imports the observations from all stations between 1999 and 2017.

Script import_sounding.py

This script has two command line arguments:

  • -b or --begin and -e or --end, which define the interval of years to import (default: 1997 to 2022).

When run, it imports the Galeão Airport radiosonde observations dataset.

Example:

python import_sounding.py

The above command imports Galeão Airport radiosonde (SBGL) observations into the project data folder.
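The import period can be restricted with the -b/-e flags described above (flag behavior assumed to match the other import scripts):

```shell
# Import only the 2000-2010 radiosonde observations
python import_sounding.py -b 2000 -e 2010
```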

Script gen_sounding_indices.py

This script has no arguments. It generates atmospheric instability indices for the data imported by the script import_sounding.py. Data from the SBGL radiosonde (located at Galeão Airport, Rio de Janeiro, Brazil) are used to calculate the indices, generating a new dataset in CSV format. This dataset contains one entry per sounding probe; the SBGL sounding station launches two probes per day (at 00:00 and 12:00 UTC). Each entry contains the values of the computed instability indices for one probe. The following indices are computed:

  • CAPE
  • CIN
  • Lifted index
  • K index
  • Total totals
  • Showalter

Usage example:

python gen_sounding_indices.py
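Several of the indices above are plain arithmetic on mandatory-level sounding values. As an illustration only (not the project's actual implementation, which works on the full sounding), the K index and Total Totals can be computed like this:

```python
def k_index(t850, t700, t500, td850, td700):
    """K index from 850/700/500 hPa temperatures and dewpoints (degrees C)."""
    return (t850 - t500) + td850 - (t700 - td700)

def total_totals(t850, t500, td850):
    """Total Totals index from 850/500 hPa temperature and 850 hPa dewpoint (degrees C)."""
    return (t850 + td850) - 2 * t500

# Example with illustrative (made-up) sounding values:
print(k_index(t850=20.0, t700=8.0, t500=-10.0, td850=15.0, td700=2.0))  # 39.0
print(total_totals(t850=20.0, t500=-10.0, td850=15.0))                  # 55.0
```

CAPE, CIN, the Lifted index, and the Showalter index require lifting a parcel through the full profile and are usually delegated to a sounding library rather than computed by hand.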

Script pre_processing.py

The preprocessing script performs several operations on the original dataset, such as creating variables and aggregating data, which can improve model training and its final results. To run it, execute python pre_processing.py. The script has three possible arguments, only the first of which is required.

The arguments are:

  • -f or --file Mandatory argument; the name of the data file that will be used as the basis for the model. It must match the name of one of the files in the project's ./data folder.
  • -d or --data Defines the data sources that will be used to assemble the dataset. The possible options are the following:
    • 'E': Weather station only
    • 'E-N': Weather station and numerical model
    • 'E-R': Weather station and radiosonde
    • 'E-N-R': Weather station, numerical model, and radiosonde
  • -n or --neighbors Defines how many nearby meteorological stations will be used to enrich the dataset.

Usage example:

python pre_processing.py -f 'RIO DE JANEIRO - FORTE DE COPACABANA_1997_2022' -d 'E-N-R' -s 5

The above command creates a dataset centered on the Forte de Copacabana station, aggregating data from the 5 nearest meteorological stations and using the weather station, numerical model, and radiosonde data sources.

Model training

The model generation script performs the training and exports the results obtained by the model after testing. It can be executed with python creates_modelo.py, which takes two arguments: -f or --file, which receives the name of one of the datasets generated by pre-processing, and -r or --reg, which defines the architecture to be used.

Execution example:

python creates_modelo.py -f 'RIO DE JANEIRO - FORTE DE COPACABANA_E-N-R_EI+5NN'

The above command creates an ordinal classification model from the pre-processed Forte de Copacabana dataset.

python creates_modelo.py -f 'RIO DE JANEIRO - FORTE DE COPACABANA_E-N-R_EI+5NN' -r 1

The above command creates a regression model from the pre-processed Forte de Copacabana dataset.

System test example

Import: python import_inmet.py -s A652

Pre-processing: python pre_processing.py -f 'RIO DE JANEIRO - FORTE DE COPACABANA' -d 'E-N-R' -s 5

Model generation: python creates_modelo.py -f 'RIO DE JANEIRO - FORTE DE COPACABANA_E-N-R_EI+5NN' -r 1
