Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: forecast and reanalysis models - update ECMWF weather model variables dictionary #736

Merged
merged 11 commits into from
Nov 23, 2024

Conversation

WilliamArmst
Copy link
Contributor

@WilliamArmst WilliamArmst commented Nov 23, 2024

Pull request type

Bugfix

Checklist

  • Lint (black rocketpy/ tests/) has passed locally
  • All tests (pytest tests -m slow --runslow) have passed locally
  • CHANGELOG.md has been updated (if relevant)

Current behavior

"time": "time" and "level": "level" were throwing key errors when trying to read those values from the dictionary.

New behavior

Behaves as expected

Breaking change

  • Yes
  • No

Additional information

You can verify this by running the following script with any ECMWF weather model (my file: ECMWF-WeatherModel.zip).

#!/usr/bin/env python3

import netCDF4
data = netCDF4.Dataset("data_stream-oper.nc")
print(f"data.variables.keys(): {data.variables.keys()}")

the output should be: data.variables.keys(): dict_keys(['number', 'valid_time', 'pressure_level', 'latitude', 'longitude', 'expver', 'z', 't', 'u', 'v'])

@WilliamArmst WilliamArmst requested a review from a team as a code owner November 23, 2024 17:58
@WilliamArmst WilliamArmst marked this pull request as draft November 23, 2024 17:59
@WilliamArmst
Copy link
Contributor Author

I'm not sure why #735 is in this PR, I thought I updated everything but ig I don't know git as well as I thought I did.

@WilliamArmst WilliamArmst marked this pull request as ready for review November 23, 2024 18:06
@Gui-FernandesBR
Copy link
Member

@WilliamArmst PR looks great. Thanks for helping us again today.

I am afraid the code corrections you have made will make the code stop working for old data, that's why we see the tests failing in the CI.

I think we should make a fallback capture of these variables: first try to get "time", if not possible, use "time" instead, if not possible again, then just give up.

This PR should solves #718 PR.

Copernicus Data Store has introduced a breaking change in their files but we - unfortunately - did not receive any heads up as final users.

@WilliamArmst
Copy link
Contributor Author

Ah ok, I'll see if I can do that.

@Gui-FernandesBR
Copy link
Member

@WilliamArmst there are multiple ways we could solve this problem.

Some options I thought:

  1. Create a ECMWFv0 dictionary, so after we try running ECMWF dictionary and finding a problem, we search for the ECMWFv0 dict
  2. Instead of defining a direct map with ECMWF, maybe we could set lists as the values. Something like: `"time": ["time", "valid_time"], then we adjust the code to accept the first key that works out with the file. I thought about creating a simple function like the following:
def get_first_matching_key_value(dictionary, keys: list[str]):
    """Given a dictionary and a list of candidate keys, return the first key and
    value that are in the dictionary.
    """
    for key in keys:
        if key in dictionary:
            return key, dictionary[key]
    raise KeyError(f"None of the keys {keys} are in the dictionary.")

Anyways, I'm open to other suggestions too. Btw it would be really nice if you could add a light (less than a few Megabytes) and new .nc file so we could create a new testing case or this new file structure. Tests on CI must be working before we merge.

Let me know if you need any other help!

@WilliamArmst
Copy link
Contributor Author

I have no clue how these tests work, but just from loading the file with all of the correct position and time parameters at data/weather/ndrt_2020_weather_data_ERA5.nc in my own code, it seems to work fine. Any ideas?

@Gui-FernandesBR
Copy link
Member

Gui-FernandesBR commented Nov 23, 2024

I have no clue how these tests work, but just from loading the file with all of the correct position and time parameters at data/weather/ndrt_2020_weather_data_ERA5.nc in my own code, it seems to work fine. Any ideas?

Thx for the fast response, I'm trying to debug it locally right now.
It seems like your last commit really solved the problem.

What is the size of your file, do you mind sharing it with us so we can create a new test case to ensure code works for both new and old .nc data? If it is more than 10 Mb, nevermind.

@Gui-FernandesBR Gui-FernandesBR linked an issue Nov 23, 2024 that may be closed by this pull request
Copy link

codecov bot commented Nov 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.97%. Comparing base (ddf71f9) to head (fa9725e).
Report is 1 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #736      +/-   ##
===========================================
+ Coverage    75.96%   75.97%   +0.01%     
===========================================
  Files           95       95              
  Lines        11022    11027       +5     
===========================================
+ Hits          8373     8378       +5     
  Misses        2649     2649              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@Gui-FernandesBR
Copy link
Member

Tests are passing both locally and on CI, including slow tests.

Before merging, I would just add a new .nc file to ensure that we prevent the same error from happening again in the future.

Great work, @WilliamArmst !!!

@WilliamArmst
Copy link
Contributor Author

The file I've been using: data_stream-oper.zip (19.21 MB). 2024, March & April, 18th - 24th, 11:00 & 12:00. For FAR in California and Tripoli Gerlach in Black Rock Desert

New file that should be able to be used with
tests\acceptance\test_ndrt_2020_rocket.py: ndrt_2020_weather_data_ERA5_new.zip (2.4 MB). 2020, Feb, 23rd, 16:00. For lat = 41.775447, long = -86.572467

@Gui-FernandesBR
Copy link
Member

Reference in

Thank you! let me try to add a new test using your file, just a sec.

@Gui-FernandesBR
Copy link
Member

Well, I tried with the ndrt_2020_weather_data_ERA5_new but it didn't work out.

c:\Users\guiga\Documents\github\RocketPy\tests\acceptance\test_ndrt_2020_rocket.py::test_ndrt_2020_rocket_data_asserts_acceptance[data/weather/ndrt_2020_weather_data_ERA5_new.nc] failed: env_file = 'data/weather/ndrt_2020_weather_data_ERA5_new.nc'

    @pytest.mark.parametrize(
        "env_file",
        [
            "data/weather/ndrt_2020_weather_data_ERA5.nc",
            "data/weather/ndrt_2020_weather_data_ERA5_new.nc",
        ],
    )
    def test_ndrt_2020_rocket_data_asserts_acceptance(env_file):
        # Notre Dame Rocket Team 2020 Flight
        # Launched at 19045-18879 Avery Rd, Three Oaks, MI 49128
        # Permission to use flight data given by Brooke Mumma, 2020
        #
        # IMPORTANT RESULTS  (23rd feb)
        # Measured Stability Margin 2.875 cal
        # Official Target Altitude 4,444 ft
        # Measured Altitude 4,320 ft or 1316.736 m
        # Drift: 2275 ft
    
        # Importing libraries
    
        # Defining all parameters
        parameters = {
            # Mass Details
            "rocket_mass": (23.321 - 2.475 - 1, 0.010),
            # propulsion details
            "impulse": (4895.050, 0.033 * 4895.050),
            "burn_time": (3.45, 0.1),
            "nozzle_radius": (49.5 / 2000, 0.001),
            "throat_radius": (21.5 / 2000, 0.001),
            "grain_separation": (3 / 1000, 0.001),
            "grain_density": (1519.708, 30),
            "grain_outer_radius": (33 / 1000, 0.001),
            "grain_initial_inner_radius": (15 / 1000, 0.002),
            "grain_initial_height": (120 / 1000, 0.001),
            # aerodynamic details
            "drag_coefficient": (0.44, 0.1),
            "inertia_i": (83.351, 0.3 * 83.351),
            "inertia_z": (0.15982, 0.3 * 0.15982),
            "radius": (203 / 2000, 0.001),
            "distance_rocket_nozzle": (-1.255, 0.100),
            "distance_rocket_propellant": (-0.85704, 0.100),
            "power_off_drag": (1, 0.033),
            "power_on_drag": (1, 0.033),
            "nose_length": (0.610, 0.001),
            "nose_distance_to_cm": (0.71971, 0.100),
            "fin_span": (0.165, 0.001),
            "fin_root_chord": (0.152, 0.001),
            "fin_tip_chord": (0.0762, 0.001),
            "fin_distance_to_cm": (-1.04956, 0.100),
            "transition_top_radius": (203 / 2000, 0.010),
            "transition_bottom_radius": (155 / 2000, 0.010),
            "transition_length": (0.127, 0.010),
            "transition_distance_to_cm": (-1.194656, 0.010),
            # launch and environment details
            "wind_direction": (0, 3),
            "wind_speed": (1, 0.30),
            "inclination": (90, 1),
            "heading": (181, 3),
            "rail_length": (3.353, 0.001),
            # parachute details
            "cd_s_drogue": (1.5 * np.pi * (24 * 25.4 / 1000) * (24 * 25.4 / 1000) / 4, 0.1),
            "cd_s_main": (2.2 * np.pi * (120 * 25.4 / 1000) * (120 * 25.4 / 1000) / 4, 0.1),
            "lag_rec": (1, 0.5),
        }
    
        # Environment conditions
        env = Environment(
            gravity=9.81,
            latitude=41.775447,
            longitude=-86.572467,
            date=(2020, 2, 23, 16),
            elevation=206,
        )
>       env.set_atmospheric_model(
            type="Reanalysis",
            file=env_file,
            dictionary="ECMWF",
        )

tests\acceptance\test_ndrt_2020_rocket.py:82: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
rocketpy\environment\environment.py:1275: in set_atmospheric_model
    self.process_forecast_reanalysis(dataset, dictionary)
rocketpy\environment\environment.py:1915: in process_forecast_reanalysis
    self.atmospheric_model_interval = get_interval_date_from_time_array(time_array)
rocketpy\environment\tools.py:435: in get_interval_date_from_time_array
    return netCDF4.num2date(
src\\cftime\\_cftime.pyx:630: in cftime._cftime.num2date
    ???
src\\cftime\\_cftime.pyx:474: in cftime._cftime.decode_date_from_scalar
    ???
.venv\Lib\site-packages\numpy\ma\core.py:4317: in __radd__
    return add(other, self)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <numpy.ma.core._MaskedBinaryOperation object at 0x00000284DAF55130>
a = cftime.DatetimeGregorian(1970, 1, 1, 0, 0, 0, 0, has_year_zero=False)
b = masked_array(data=--,
             mask=True,
       fill_value=np.str_('?'),
            dtype=object)
args = (), kwargs = {}
da = array(cftime.DatetimeGregorian(1970, 1, 1, 0, 0, 0, 0, has_year_zero=False),
      dtype=object)
db = array(datetime.timedelta(0), dtype=object)
result = cftime.DatetimeGregorian(1970, 1, 1, 0, 0, 0, 0, has_year_zero=False)
ma = np.False_, mb = array(True), m = np.True_

    def __call__(self, a, b, *args, **kwargs):
        """
        Execute the call behavior.
    
        """
        # Get the data, as ndarray
        (da, db) = (getdata(a), getdata(b))
        # Get the result
        with np.errstate():
            np.seterr(divide='ignore', invalid='ignore')
            result = self.f(da, db, *args, **kwargs)
        # Get the mask for the result
        (ma, mb) = (getmask(a), getmask(b))
        if ma is nomask:
            if mb is nomask:
                m = nomask
            else:
                m = umath.logical_or(getmaskarray(a), mb)
        elif mb is nomask:
            m = umath.logical_or(ma, getmaskarray(b))
        else:
            m = umath.logical_or(ma, mb)
    
        # Case 1. : scalar
>       if not result.ndim:
E       AttributeError: 'cftime._cftime.DatetimeGregorian' object has no attribute 'ndim'

.venv\Lib\site-packages\numpy\ma\core.py:1068: AttributeError

@WilliamArmst
Copy link
Contributor Author

get_interval_date_from_time_array didn't know how to get a time interval from an array with only one item. Actually should I have changed the get_interval_date_from_time_array specifically instead of trying to bypass it in this specific case?

@Gui-FernandesBR
Copy link
Member

get_interval_date_from_time_array didn't know how to get a time interval from an array with only one item. Actually should I have changed the get_interval_date_from_time_array specifically instead of trying to bypass it in this specific case?

Actually, it worked as a charm.
You have great debugging skills.

Copy link
Member

@Gui-FernandesBR Gui-FernandesBR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, nice fix!

@WilliamArmst
Copy link
Contributor Author

Thx for all of your help with this one and #735, really appreciate it.

@Gui-FernandesBR Gui-FernandesBR merged commit 1f3eefc into RocketPy-Team:develop Nov 23, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Closed
Development

Successfully merging this pull request may close these issues.

BUG: Problem with NetCFD4 file from CDS
2 participants