This code provides a systematic approach to handling and analyzing large-scale weather datasets stored in the .nc format.
See the article in the description for more information.
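Ensure you have the following third-party Python packages installed:
netCDF4 numpy pandas scikit-learn (imported as sklearn) xarray tqdm joblib
The script also uses os, datetime, and multiprocessing, which ship with the Python standard library and need no installation.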
The script performs the following operations:
- Loads data from .nc files located in a specified directory.
- Processes and merges these datasets.
- Validates the time range of data.
- Splits the data into training and testing datasets.
- Applies Partial Least Squares (PLS) Regression.
- Transforms and validates the resultant data.
- Finds the most similar data points using parallel computation.
time_interval_reconstruction = ["2018-01-01 00:00:00", "2019-12-31 23:54:00"]
predict_variable = 'WSPDchyv2_2011_2019'
k = 3
folder_path = '/Users/Murilo/weather_data'
netCDF_resolution = 1
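One way the time-range validation step could use `time_interval_reconstruction` is sketched below with the standard library only. The helper `validate_interval` and the coverage bounds `data_start` and `data_end` are hypothetical: the bounds are guessed from the `2011_2019` suffix of the variable name and should be replaced by the range actually found in the loaded files.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def validate_interval(interval, data_start, data_end):
    """Parse a [start, end] pair of timestamp strings and check that it is
    ordered and lies inside the period covered by the data.

    Hypothetical helper: not part of the original script.
    """
    start, end = (datetime.strptime(t, TIME_FORMAT) for t in interval)
    if start >= end:
        raise ValueError(f"start {start} is not before end {end}")
    if start < data_start or end > data_end:
        raise ValueError(
            f"interval {start} .. {end} is outside the data range "
            f"{data_start} .. {data_end}"
        )
    return start, end


time_interval_reconstruction = ["2018-01-01 00:00:00", "2019-12-31 23:54:00"]

# Assumed coverage of the dataset; read it from the files in practice.
data_start = datetime(2011, 1, 1)
data_end = datetime(2019, 12, 31, 23, 59, 59)

start, end = validate_interval(time_interval_reconstruction, data_start, data_end)
```

Failing early with a `ValueError` keeps a mistyped interval from silently producing an empty training or testing set later in the pipeline.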