RTX Code Release (PI-4): Additions to T-Route to be run at the Old River Control Center to connect disconnected flowline networks and assimilate using observations #847

taddyb · 2024-12-17T02:30:05Z

Today, Raytheon showcased the following code to NOAA-OWP demonstrating the ability for a graph breakage at the old-river-control center to be connected and assimilated within T-Route using USGS observations. This PR is the accompanying code that contains only feature code related to that release (Note: provided test files are in test/water_transfer and this only works with Hydrofabric v20.1

NOTE: A significant number of these additions/files come from including forcing files and a jupyter notebook to test this code

Additions

If you look in our config file, we're using the following fields to reference the data:

compute_parameters:
    data_assimilation_parameters:
        divergence_outflow : divergence_obs/old_river_lock.csv

With an addition to compute_parameters.py:

divergence_outflow: Optional[FilePath] = None
"""
CSV file containing DA values for man-made breakages in the hydrofabric (locks). Needs to be obtained and pre-processed from https://waterdata.usgs.gov/monitoring-location/ observations
NOTE: Required for Old River Lock DA (2.2.3.15.2).
"""

and a new class in the DataAssimilation.py file to follow the patterns within the file

class Divergence(AbstractDA):
    def __init__(self, network, from_files, value_dict):
        LOG.info("divergence class is started.")
        start_time = time.time()
        data_assimilation_parameters = self._data_assimilation_parameters
        run_parameters = self._run_parameters

        self._divergence_df = pd.DataFrame()
        
        if data_assimilation_parameters.get("divergence_outflow", False):
            self._divergence_df = _create_divergence_df(run_parameters, network, data_assimilation_parameters)

        LOG.debug("divergence class is completed in %s seconds." % (time.time() - start_time))

Removals

Changes

Updates to the LICENSE to include information that this work was paid for by a government contract, and data attributions to Lynker Spatial for geopackage files for hydrofabric v20.1

Testing

See test/water_transfer/README.md for testing information

Screenshots

Notes

Todos

Checklist

Testing checklist

Target Environment support

Windows
Linux
Browser

Accessibility

Keyboard friendly
Screen reader friendly

Other

…to old-river-lock

shorvath-noaa · 2025-03-05T19:06:14Z

I believe this PR is diverting surface runoff (q_lateral) rather than streamflow (q0). This is an issue as the streamflow being used for data assimilation from USGS gage 07381482 will be orders of magnitude greater than the surface runoff going into the segment ID that water is being "diverted" from.

The testing example of this PR is also confusing q_lateral and q0, as can be seen in the forcing files in the test/water_transfer/channel_forcing/ directory. In the test/water_transfer/channel_forcing/20230101_20230110.parquet file, where I assume the CHRTOUT_DOMAIN1.csv files were derived from, the values being used to populate the q_lateral forcing files is actually the "streamflow" variable in the parquet file:

CHRTOUT (forcing) file:

parquet file:

This is providing the streamflow of the segment ID as the forcing q_lateral, which is unrealistically high for a forcing value. The issue here will come when the network.diverge_flow() function compares the USGS gage provided flow with a true q_lateral value going into the segment ID from which flow is supposed to be diverted. The q_lateral value will be much smaller than the USGS observed flow, so this function will simply zero out the flow to be diverted.

shorvath-noaa · 2025-03-05T19:21:49Z

This is from the NWMv3.0 retrospective 202301010000.CHRTOUT_DOMAIN1 file downloaded from AWS:

q_lateral is ~0.5 and streamflow is ~12,200

taddyb · 2025-03-05T23:14:23Z

Thanks for the feedback, @shorvath-noaa

It seems like it's a simple switch from adding flows from the Old River Lock Gauge to self._qlateral to self._q0 inside of the divergence function is necessary as the observations should be considered streamflow rather than lateral inflow

https://github.com/NGWPC/t-route/blob/845781c06987561276d17a370b4887ccefb462be/src/troute-network/troute/HYFeaturesNetwork.py#L746

Regarding the q_lateral from the retrospective, one problem that arises from this test case is how large the subsets are as the Old River Lock is in VPU 8. The input files required for testing are > 50MB (github's file size limit) and if we were to have q_lateral from the NWM-retrospective for all upstream catchments, the amount of data uploaded into this repo would be significant. Thus, to ensure testability when delivering I used the streamflow from the retrospective set at a point just above the old river lock to ensure the flows were accurate as streamflow is comparable in that situation. My question is how to ensure your requests of q_lateral within the CHRTOUT_DOMAIN1.csv files are met while ensuring we're not uploading data to this repo that is too large.

shorvath-noaa · 2025-03-06T16:45:27Z

@taddyb I'm not sure simply switching self._qlateral to self._q0 will suffice. self._q0 is just a snapshot of the flow at the last computed timestep (not the entire timeseries of flow, velocity, and depth). The difficulty here is that when t-route calls nwm_route(), it calculates several timesteps in one go. Exactly how many is user defined with the compute_parameters -> forcing_parameters -> max_loop_size: parameter (default 24 hours). So applying DA just prior to nwm_route() means t-route is really only forcing this diversion once every 24 hours (or even less frequently if the user increases the max_loop_size parameter).

I'm not familiar with what the mandate was for this new DA method, but I imagine it's desirable for it to be similar to streamflow_da, which interpolates 15min gage data into 5 minute intervals (matching t-route's internal time step), and then applies DA at every time step. Do you know if that's the case?

shorvath-noaa · 2025-03-06T16:50:06Z

If I remember correctly, this is the first of several diversion locations that should eventually be implemented in t-route. If that's the case, I also think it's worth making this more generic so it can be easily extended to future locations.

shorvath-noaa · 2025-03-06T16:55:29Z

As to the second point, I agree that it's not good to add tons of data files to the repo. One option would be to create a restart file for this test which puts water in the channels at initialization, then just include a subset of forcing data for the domain. Or I suppose some clever unit tests could capture whether this diversion is working properly.

taddyb · 2025-03-06T21:20:05Z

I'll try to address all of the comments in one message. Hopefully it won't get too winded/long

@taddyb I'm not sure simply switching self._qlateral to self._q0 will suffice. self._q0 is just a snapshot of the flow at the last computed timestep (not the entire timeseries of flow, velocity, and depth). The difficulty here is that when t-route calls nwm_route(), it calculates several timesteps in one go. Exactly how many is user defined with the compute_parameters -> forcing_parameters -> max_loop_size: parameter (default 24 hours). So applying DA just prior to nwm_route() means t-route is really only forcing this diversion once every 24 hours (or even less frequently if the user increases the max_loop_size parameter).

I'm not familiar with what the mandate was for this new DA method, but I imagine it's desirable for it to be similar to streamflow_da, which interpolates 15min gage data into 5 minute intervals (matching t-route's internal time step), and then applies DA at every time step. Do you know if that's the case?

The deliverable is: "Perform a simulation in which a time series of streamflow values is subtracted from the flow of the Mississippi River and is added to the flow of the Atchafalaya River at the Old River Control Structure.assimilated into the T-route channel network as a subtraction from flow at one node and as an addition to flow at another node. Demonstrate the above simulation in a way that the last specified value in the flow time series final assimilated value persists instead of transitioning back to the simulated flow."

Looking more into the self.q0 object, you're right that setting the qu0 (upstream discharge) would only update flow every 24 hours. If you wanted to mimic the streamflow da, from my understanding one would have to make edits to the fp_da_map similar to how usgs_df is handled (https://github.com/NGWPC/t-route/blob/845781c06987561276d17a370b4887ccefb462be/src/troute-routing/troute/routing/diffusive_utils_v02.py#L512). However, things get messy with the JIT parallel computing as the Atchafalaya and Mississippi Rivers are on two different stems. Each currently is computed in parallel, and might create a bottleneck if one river is waiting for the other to route its flows. From a hydrological perspective, using q_lateral as an "insertion" of flow would work, but I see your point in using it to subtract flows may get dicey if there isn't sufficient lateral flow.

If I remember correctly, this is the first of several diversion locations that should eventually be implemented in t-route. If that's the case, I also think it's worth making this more generic so it can be easily extended to future locations.

Correct! This was supposed to be a specific, targeted, notebook and example for how one would subtract streamflow from one river reach into the other. Examples of where this will be made more generic (this is planned work for the future) is:

more customization of timestep length
- this is hardcoded to 24 hours
- https://github.com/NGWPC/t-route/blob/845781c06987561276d17a370b4887ccefb462be/src/troute-network/troute/HYFeaturesNetwork.py#L726
Using POIs in the hydrofabric to identify where there are divergences where flow has to be moved
- Currently the HF IDs are hardcoded
- https://github.com/NGWPC/t-route/blob/9e84a6e205abe94f64dbdc566627825cda1b5348/src/troute-network/troute/DataAssimilation.py#L1215

As to the second point, I agree that it's not good to add tons of data files to the repo. One option would be to create a restart file for this test which puts water in the channels at initialization, then just include a subset of forcing data for the domain. Or I suppose some clever unit tests could capture whether this diversion is working properly.

restart files are a good idea. But, just know I'm always down for some clever unit tests 👍

shorvath-noaa · 2025-03-07T18:14:10Z

The deliverable is: "Perform a simulation in which a time series of streamflow values is subtracted from the flow of the Mississippi River and is added to the flow of the Atchafalaya River at the Old River Control Structure.assimilated into the T-route channel network as a subtraction from flow at one node and as an addition to flow at another node. Demonstrate the above simulation in a way that the last specified value in the flow time series final assimilated value persists instead of transitioning back to the simulated flow."

Thanks for providing this! It does seem to me that the flow should be diverted at every time step (I believe that is what "time series of streamflow" means). So I think this DA method needs to be performed somewhere within nwm_route() rather than outside of it.

Looking more into the self.q0 object, you're right that setting the qu0 (upstream discharge) would only update flow every 24 hours. If you wanted to mimic the streamflow da, from my understanding one would have to make edits to the fp_da_map similar to how usgs_df is handled (https://github.com/NGWPC/t-route/blob/845781c06987561276d17a370b4887ccefb462be/src/troute-routing/troute/routing/diffusive_utils_v02.py#L512). However, things get messy with the JIT parallel computing as the Atchafalaya and Mississippi Rivers are on two different stems. Each currently is computed in parallel, and might create a bottleneck if one river is waiting for the other to route its flows. From a hydrological perspective, using q_lateral as an "insertion" of flow would work, but I see your point in using it to subtract flows may get dicey if there isn't sufficient lateral flow.

I think you hit the nail on the head here! Extracting flow from one network and moving it to another seems to be the real challenge. Difficult, but I think doable.

I believe providing the flow to the Atchafalaya via q_lateral would produce the correct flow there, but that still leaves the issue of extracting flow from the Mississippi's q_lateral. That q_lateral will be orders of magnitude smaller than the streamflow, so t-route will (I think always) run into the issue of trying to divert a large streamflow from a much smaller q_lateral. It seems to me that the only way to properly do this is to extract it from the streamflow itself.

taddyb · 2025-03-07T18:26:19Z

I think we're on the same page then with what needs to be done with for this PR to be merged. imho, any future deliveries related to this capability should be tacked onto this PR as:

what currently exists is an example of stream divergence at one location
there is future work to extend this to many other locations and for it to be generic
the points about q0 streamflow subtraction rather than q_lateral that are mentioned above

I think the work is fine, but it shouldn't be merged yet as it's not a "large-scale" solution/ as you said needs to be more generic

I'm also a fan of squashing these commits once the full capability is built out as the commit history is already long.

I appreciate the comments/feedback and code review @shorvath-noaa

alex.ju and others added 27 commits October 11, 2024 13:00

add first set of unit tests in github workflow

b0250b6

add unit tests for troute-config

4a70256

change workflow name

394deb4

Revert "Add Unit Test to Github Action Workflow"

9194fa5

adding tests for water transfer

549f4d6

removing the geopackages and adding divergence obs

58cf77b

adding divergence forcings, obs, and config entries

3d6dc2d

finished flow divergence

3c65b1b

finished water transfer work

fdfa250

cleaned up comments and formatting

35d628e

added notebook dockerfile for OE testing

dd06db7

found a bug in discharge calc

c81781f

deleting unneccessary files:

67f8be2

adding saved change to notebook

03e53ce

Update dockerfile_notebook.md

4f5bbe6

adding support for better last obs handling

da6ea98

Merge branch 'old-river-lock' of https://github.com/taddyb/t-route in…

fea3744

…to old-river-lock

reformatted plots

adbef61

squashed merge from development

195e216

reformatted requirements and .gitignore

8e74c21

reformatting from pull

76fbd9a

added new files from git checkpoint

73feb12

deleting API from feature to simplify commit

b7066c3

Update README.md

60c1329

updating license

9940276

addoing pyproject

03339fe

updates to dockerfile from cherry-picked commits

a3d0793

removing unneccessary __init__ global import

9e84a6e

taddyb marked this pull request as draft December 17, 2024 22:35

This was referenced Jan 6, 2025

fix: DataFrame merge error on id field for HF >= 2.2 #851

Closed

fix: DataFrame merge error on id field for HF >= 2.2 shorvath-noaa/t-route#3

Merged

taddyb marked this pull request as ready for review January 10, 2025 18:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

RTX Code Release (PI-4): Additions to T-Route to be run at the Old River Control Center to connect disconnected flowline networks and assimilate using observations #847

RTX Code Release (PI-4): Additions to T-Route to be run at the Old River Control Center to connect disconnected flowline networks and assimilate using observations #847

Uh oh!

taddyb commented Dec 17, 2024

Uh oh!

shorvath-noaa commented Mar 5, 2025

Uh oh!

shorvath-noaa commented Mar 5, 2025

Uh oh!

taddyb commented Mar 5, 2025

Uh oh!

shorvath-noaa commented Mar 6, 2025

Uh oh!

shorvath-noaa commented Mar 6, 2025

Uh oh!

shorvath-noaa commented Mar 6, 2025

Uh oh!

taddyb commented Mar 6, 2025 •

edited

Loading

Uh oh!

shorvath-noaa commented Mar 7, 2025

Uh oh!

taddyb commented Mar 7, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

RTX Code Release (PI-4): Additions to T-Route to be run at the Old River Control Center to connect disconnected flowline networks and assimilate using observations #847

Are you sure you want to change the base?

RTX Code Release (PI-4): Additions to T-Route to be run at the Old River Control Center to connect disconnected flowline networks and assimilate using observations #847

Uh oh!

Conversation

taddyb commented Dec 17, 2024

Additions

Removals

Changes

Testing

Screenshots

Notes

Todos

Checklist

Testing checklist

Target Environment support

Accessibility

Other

Uh oh!

shorvath-noaa commented Mar 5, 2025

Uh oh!

shorvath-noaa commented Mar 5, 2025

Uh oh!

taddyb commented Mar 5, 2025

Uh oh!

shorvath-noaa commented Mar 6, 2025

Uh oh!

shorvath-noaa commented Mar 6, 2025

Uh oh!

shorvath-noaa commented Mar 6, 2025

Uh oh!

taddyb commented Mar 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

shorvath-noaa commented Mar 7, 2025

Uh oh!

taddyb commented Mar 7, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

taddyb commented Mar 6, 2025 •

edited

Loading