Skip to content

Conversation

taddyb
Copy link

@taddyb taddyb commented Dec 17, 2024

Today, Raytheon showcased the following code to NOAA-OWP demonstrating the ability for a graph breakage at the old-river-control center to be connected and assimilated within T-Route using USGS observations. This PR is the accompanying code that contains only feature code related to that release (Note: provided test files are in test/water_transfer and this only works with Hydrofabric v20.1

NOTE: A significant number of these additions/files come from including forcing files and a jupyter notebook to test this code

Additions

If you look in our config file, we're using the following fields to reference the data:

compute_parameters:
    data_assimilation_parameters:
        divergence_outflow : divergence_obs/old_river_lock.csv

With an addition to compute_parameters.py:

divergence_outflow: Optional[FilePath] = None
"""
CSV file containing DA values for man-made breakages in the hydrofabric (locks). Needs to be obtained and pre-processed from https://waterdata.usgs.gov/monitoring-location/ observations
NOTE: Required for Old River Lock DA (2.2.3.15.2).
"""

and a new class in the DataAssimilation.py file to follow the patterns within the file

class Divergence(AbstractDA):
    def __init__(self, network, from_files, value_dict):
        LOG.info("divergence class is started.")
        start_time = time.time()
        data_assimilation_parameters = self._data_assimilation_parameters
        run_parameters = self._run_parameters

        self._divergence_df = pd.DataFrame()
        
        if data_assimilation_parameters.get("divergence_outflow", False):
            self._divergence_df = _create_divergence_df(run_parameters, network, data_assimilation_parameters)

        LOG.debug("divergence class is completed in %s seconds." % (time.time() - start_time))

Removals

Changes

  • Updates to the LICENSE to include information that this work was paid for by a government contract, and data attributions to Lynker Spatial for geopackage files for hydrofabric v20.1

Testing

  1. See test/water_transfer/README.md for testing information

Screenshots

image

Notes

Todos

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (link if applicable)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Visually tested in supported browsers and devices (see checklist below 👇)
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Testing checklist

Target Environment support

  • Windows
  • Linux
  • Browser

Accessibility

  • Keyboard friendly
  • Screen reader friendly

Other

  • Is useable without CSS
  • Is useable without JS
  • Flexible from small to large screens
  • No linting errors or warnings
  • JavaScript tests are passing

@taddyb taddyb changed the title RTX Code Release (PI-4): Additions to T-Route to be run at the Old River Control Center and connect a disconnected graph RTX Code Release (PI-4): Additions to T-Route to be run at the Old River Control Center to connect disconnected flowline networks and assimilate using observations Dec 17, 2024
@taddyb taddyb marked this pull request as draft December 17, 2024 22:35
@shorvath-noaa
Copy link
Collaborator

I believe this PR is diverting surface runoff (q_lateral) rather than streamflow (q0). This is an issue as the streamflow being used for data assimilation from USGS gage 07381482 will be orders of magnitude greater than the surface runoff going into the segment ID that water is being "diverted" from.

The testing example of this PR is also confusing q_lateral and q0, as can be seen in the forcing files in the test/water_transfer/channel_forcing/ directory. In the test/water_transfer/channel_forcing/20230101_20230110.parquet file, where I assume the CHRTOUT_DOMAIN1.csv files were derived from, the values being used to populate the q_lateral forcing files is actually the "streamflow" variable in the parquet file:

CHRTOUT (forcing) file:
image

parquet file:
image

This is providing the streamflow of the segment ID as the forcing q_lateral, which is unrealistically high for a forcing value. The issue here will come when the network.diverge_flow() function compares the USGS gage provided flow with a true q_lateral value going into the segment ID from which flow is supposed to be diverted. The q_lateral value will be much smaller than the USGS observed flow, so this function will simply zero out the flow to be diverted.

@shorvath-noaa
Copy link
Collaborator

This is from the NWMv3.0 retrospective 202301010000.CHRTOUT_DOMAIN1 file downloaded from AWS:

image

q_lateral is ~0.5 and streamflow is ~12,200

@taddyb
Copy link
Author

taddyb commented Mar 5, 2025

Thanks for the feedback, @shorvath-noaa

It seems like it's a simple switch from adding flows from the Old River Lock Gauge to self._qlateral to self._q0 inside of the divergence function is necessary as the observations should be considered streamflow rather than lateral inflow

https://github.com/NGWPC/t-route/blob/845781c06987561276d17a370b4887ccefb462be/src/troute-network/troute/HYFeaturesNetwork.py#L746

Regarding the q_lateral from the retrospective, one problem that arises from this test case is how large the subsets are as the Old River Lock is in VPU 8. The input files required for testing are > 50MB (github's file size limit) and if we were to have q_lateral from the NWM-retrospective for all upstream catchments, the amount of data uploaded into this repo would be significant. Thus, to ensure testability when delivering I used the streamflow from the retrospective set at a point just above the old river lock to ensure the flows were accurate as streamflow is comparable in that situation. My question is how to ensure your requests of q_lateral within the CHRTOUT_DOMAIN1.csv files are met while ensuring we're not uploading data to this repo that is too large.

@shorvath-noaa
Copy link
Collaborator

@taddyb I'm not sure simply switching self._qlateral to self._q0 will suffice. self._q0 is just a snapshot of the flow at the last computed timestep (not the entire timeseries of flow, velocity, and depth). The difficulty here is that when t-route calls nwm_route(), it calculates several timesteps in one go. Exactly how many is user defined with the compute_parameters -> forcing_parameters -> max_loop_size: parameter (default 24 hours). So applying DA just prior to nwm_route() means t-route is really only forcing this diversion once every 24 hours (or even less frequently if the user increases the max_loop_size parameter).

I'm not familiar with what the mandate was for this new DA method, but I imagine it's desirable for it to be similar to streamflow_da, which interpolates 15min gage data into 5 minute intervals (matching t-route's internal time step), and then applies DA at every time step. Do you know if that's the case?

@shorvath-noaa
Copy link
Collaborator

If I remember correctly, this is the first of several diversion locations that should eventually be implemented in t-route. If that's the case, I also think it's worth making this more generic so it can be easily extended to future locations.

@shorvath-noaa
Copy link
Collaborator

As to the second point, I agree that it's not good to add tons of data files to the repo. One option would be to create a restart file for this test which puts water in the channels at initialization, then just include a subset of forcing data for the domain. Or I suppose some clever unit tests could capture whether this diversion is working properly.

@taddyb
Copy link
Author

taddyb commented Mar 6, 2025

I'll try to address all of the comments in one message. Hopefully it won't get too winded/long

@taddyb I'm not sure simply switching self._qlateral to self._q0 will suffice. self._q0 is just a snapshot of the flow at the last computed timestep (not the entire timeseries of flow, velocity, and depth). The difficulty here is that when t-route calls nwm_route(), it calculates several timesteps in one go. Exactly how many is user defined with the compute_parameters -> forcing_parameters -> max_loop_size: parameter (default 24 hours). So applying DA just prior to nwm_route() means t-route is really only forcing this diversion once every 24 hours (or even less frequently if the user increases the max_loop_size parameter).

I'm not familiar with what the mandate was for this new DA method, but I imagine it's desirable for it to be similar to streamflow_da, which interpolates 15min gage data into 5 minute intervals (matching t-route's internal time step), and then applies DA at every time step. Do you know if that's the case?

The deliverable is: "Perform a simulation in which a time series of streamflow values is subtracted from the flow of the Mississippi River and is added to the flow of the Atchafalaya River at the Old River Control Structure.assimilated into the T-route channel network as a subtraction from flow at one node and as an addition to flow at another node. Demonstrate the above simulation in a way that the last specified value in the flow time series final assimilated value persists instead of transitioning back to the simulated flow."

Looking more into the self.q0 object, you're right that setting the qu0 (upstream discharge) would only update flow every 24 hours. If you wanted to mimic the streamflow da, from my understanding one would have to make edits to the fp_da_map similar to how usgs_df is handled (https://github.com/NGWPC/t-route/blob/845781c06987561276d17a370b4887ccefb462be/src/troute-routing/troute/routing/diffusive_utils_v02.py#L512). However, things get messy with the JIT parallel computing as the Atchafalaya and Mississippi Rivers are on two different stems. Each currently is computed in parallel, and might create a bottleneck if one river is waiting for the other to route its flows. From a hydrological perspective, using q_lateral as an "insertion" of flow would work, but I see your point in using it to subtract flows may get dicey if there isn't sufficient lateral flow.

If I remember correctly, this is the first of several diversion locations that should eventually be implemented in t-route. If that's the case, I also think it's worth making this more generic so it can be easily extended to future locations.

Correct! This was supposed to be a specific, targeted, notebook and example for how one would subtract streamflow from one river reach into the other. Examples of where this will be made more generic (this is planned work for the future) is:

As to the second point, I agree that it's not good to add tons of data files to the repo. One option would be to create a restart file for this test which puts water in the channels at initialization, then just include a subset of forcing data for the domain. Or I suppose some clever unit tests could capture whether this diversion is working properly.

restart files are a good idea. But, just know I'm always down for some clever unit tests 👍

@shorvath-noaa
Copy link
Collaborator

The deliverable is: "Perform a simulation in which a time series of streamflow values is subtracted from the flow of the Mississippi River and is added to the flow of the Atchafalaya River at the Old River Control Structure.assimilated into the T-route channel network as a subtraction from flow at one node and as an addition to flow at another node. Demonstrate the above simulation in a way that the last specified value in the flow time series final assimilated value persists instead of transitioning back to the simulated flow."

Thanks for providing this! It does seem to me that the flow should be diverted at every time step (I believe that is what "time series of streamflow" means). So I think this DA method needs to be performed somewhere within nwm_route() rather than outside of it.

Looking more into the self.q0 object, you're right that setting the qu0 (upstream discharge) would only update flow every 24 hours. If you wanted to mimic the streamflow da, from my understanding one would have to make edits to the fp_da_map similar to how usgs_df is handled (https://github.com/NGWPC/t-route/blob/845781c06987561276d17a370b4887ccefb462be/src/troute-routing/troute/routing/diffusive_utils_v02.py#L512). However, things get messy with the JIT parallel computing as the Atchafalaya and Mississippi Rivers are on two different stems. Each currently is computed in parallel, and might create a bottleneck if one river is waiting for the other to route its flows. From a hydrological perspective, using q_lateral as an "insertion" of flow would work, but I see your point in using it to subtract flows may get dicey if there isn't sufficient lateral flow.

I think you hit the nail on the head here! Extracting flow from one network and moving it to another seems to be the real challenge. Difficult, but I think doable.

I believe providing the flow to the Atchafalaya via q_lateral would produce the correct flow there, but that still leaves the issue of extracting flow from the Mississippi's q_lateral. That q_lateral will be orders of magnitude smaller than the streamflow, so t-route will (I think always) run into the issue of trying to divert a large streamflow from a much smaller q_lateral. It seems to me that the only way to properly do this is to extract it from the streamflow itself.

@taddyb
Copy link
Author

taddyb commented Mar 7, 2025

I think we're on the same page then with what needs to be done with for this PR to be merged. imho, any future deliveries related to this capability should be tacked onto this PR as:

  • what currently exists is an example of stream divergence at one location
  • there is future work to extend this to many other locations and for it to be generic
  • the points about q0 streamflow subtraction rather than q_lateral that are mentioned above

I think the work is fine, but it shouldn't be merged yet as it's not a "large-scale" solution/ as you said needs to be more generic

I'm also a fan of squashing these commits once the full capability is built out as the commit history is already long.

I appreciate the comments/feedback and code review @shorvath-noaa

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants