Create tool to retrieve rating curves from USGS sites and convert to elevation (NAVD88). Intended to be used as part of the Sierra Test.

- Modify usgs_gage_crosswalk.py to:
    - Look for location_id instead of site_no attribute field in usgs_gages.gpkg file.
    - Filter out gages that do not have rating curves included in the usgs_rating_curves.csv.
- Modify rating_curve_comparison.py to check the age of the user-specified usgs_rating_curves.csv, report that age to the user, and recommend updating if the file is older than 30 days.
- Add rating_curve_get_usgs_curves.py. This script will generate the following files:
    - usgs_rating_curves.csv: A csv file containing rating curves (with stages converted to NAVD88 elevations) for USGS gages, in a format compatible with rating_curve_comparisons.py. As currently configured, only gages within CONUS will have rating curve data.
    - log.csv: A log file that records status for each gage and includes error messages.
    - usgs_gages.gpkg: A geospatial layer (in FIM projection) of all active USGS gages that meet predefined criteria. The curve attribute indicates whether a rating curve for the gage is found in the usgs_rating_curves.csv. This spatial file is only generated if the all option is passed with the -l argument.

This resolves #289.
TrevorGrout-NOAA authored Apr 5, 2021
1 parent 44b6333 commit 40631f4
Showing 5 changed files with 556 additions and 15 deletions.
29 changes: 22 additions & 7 deletions CHANGELOG.md
@@ -1,17 +1,32 @@
All notable changes to this project will be documented in this file.
We follow the [Semantic Versioning 2.0.0](http://semver.org/) format.
## v3.0.14.0 - 2021-04-05 - [PR #338](https://github.com/NOAA-OWP/cahaba/pull/338)

## v3.0.12.2 - 2021-04-01 - [PR #332](https://github.com/NOAA-OWP/cahaba/pull/332)
Create tool to retrieve rating curves from USGS sites and convert to elevation (NAVD88). Intended to be used as part of the Sierra Test.

### Changes
- Modify `usgs_gage_crosswalk.py` to:
1) Look for `location_id` instead of `site_no` attribute field in `usgs_gages.gpkg` file.
2) Filter out gages that do not have rating curves included in the `usgs_rating_curves.csv`.
- Modify `rating_curve_comparison.py` to check the age of the user-specified `usgs_rating_curves.csv`, report that age to the user, and recommend updating if the file is older than 30 days.

### Additions
- Add `rating_curve_get_usgs_curves.py`. This script will generate the following files:
1) `usgs_rating_curves.csv`: A csv file containing rating curves (with stages converted to NAVD88 elevations) for USGS gages, in a format compatible with `rating_curve_comparisons.py`. As currently configured, only gages within CONUS will have rating curve data.
2) `log.csv`: A log file that records status for each gage and includes error messages.
3) `usgs_gages.gpkg`: A geospatial layer (in FIM projection) of all active USGS gages that meet predefined criteria. The `curve` attribute indicates whether a rating curve for the gage is found in the `usgs_rating_curves.csv`. This spatial file is only generated if the `all` option is passed with the `-l` argument.
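As a rough sketch of how the generated `usgs_rating_curves.csv` might be consumed downstream — the column names used here (`location_id`, `stage_ft`, `datum_navd88_ft`, `flow_cfs`) are illustrative assumptions, not the tool's documented schema, so check the actual CSV header first:

```python
import pandas as pd

# Hypothetical rows mimicking the generated CSV; values are made up.
curves = pd.DataFrame({
    "location_id": ["02146470", "02146470"],
    "stage_ft": [1.0, 2.0],
    "datum_navd88_ft": [500.0, 500.0],
    "flow_cfs": [10.0, 40.0],
})

# The NAVD88 conversion amounts to adding the gage datum to the stage,
# which is the kind of elevation column the tool writes out.
curves["elevation_navd88_ft"] = curves["stage_ft"] + curves["datum_navd88_ft"]

# Select one gage's converted curve by its location_id.
one_gage = curves[curves["location_id"] == "02146470"]
print(one_gage["elevation_navd88_ft"].tolist())  # [501.0, 502.0]
```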
<br/><br/>
## v3.0.13.0 - 2021-04-01 - [PR #332](https://github.com/NOAA-OWP/cahaba/pull/332)

Created tool to compare synthetic rating curve with benchmark rating curve (sierra test). resolves issue #293; resolves issue #308
Created tool to compare synthetic rating curve with benchmark rating curve (Sierra Test).

### Changes
- update `aggregate_fim_outputs.py` call argument in `fim_run.sh` from 4 jobs to 6 jobs (optimizing API performance)
- reroutes median elevation data from `add_crosswalk.py` and `rem.py` to new file (depreciating `hand_ref_elev_table.csv`)
- adding new files to `viz_whitelist` in `output_cleanup.py`
- Update `aggregate_fim_outputs.py` call argument in `fim_run.sh` from 4 jobs to 6 jobs, to optimize API performance.
- Reroutes median elevation data from `add_crosswalk.py` and `rem.py` to new file (deprecating `hand_ref_elev_table.csv`).
- Adds new files to `viz_whitelist` in `output_cleanup.py`.

### Additions
- `usgs_gage_crosswalk.py`: generates `usgs_elev_table.csv` in run_by_unit.py with elevation and additional attributes at USGS gages.
- `usgs_gage_crosswalk.py`: generates `usgs_elev_table.csv` in `run_by_unit.py` with elevation and additional attributes at USGS gages.
- `rating_curve_comparison.py`: post-processing script to plot and calculate metrics between synthetic rating curves and USGS rating curve data.
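A minimal sketch (with made-up numbers, not the script's actual metrics) of one way a synthetic curve can be compared against a USGS curve: interpolate the synthetic curve onto the USGS flow values and take the mean absolute elevation difference.

```python
import numpy as np

# Hypothetical (flow, elevation) pairs for a USGS and a synthetic rating curve.
usgs_flow = np.array([10.0, 50.0, 100.0])
usgs_elev = np.array([501.0, 503.0, 505.0])
synth_flow = np.array([5.0, 60.0, 120.0])
synth_elev = np.array([500.8, 503.6, 505.5])

# Interpolate the synthetic curve onto the USGS flows so the
# elevations can be compared at common points.
synth_on_usgs = np.interp(usgs_flow, synth_flow, synth_elev)

# One simple summary metric: mean absolute elevation difference.
mean_abs_diff = np.mean(np.abs(synth_on_usgs - usgs_elev))
print(round(mean_abs_diff, 3))  # 0.093
```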
<br/><br/>

@@ -36,7 +51,7 @@ Add more detail/information to plotting capabilities.
### Changes
- Merge `plot_functions.py` into `eval_plots.py` and move `eval_plots.py` into the tools directory.
- Remove `plots` subdirectory.

### Additions
- Optional argument to create barplots of CSI for each individual site.
- Create a csv containing the data used to create the scatterplots.
19 changes: 11 additions & 8 deletions src/usgs_gage_crosswalk.py
@@ -41,15 +41,18 @@ def crosswalk_usgs_gage(usgs_gages_filename,dem_filename,input_flows_filename,in
input_catchment = gpd.read_file(input_catchment_filename)
dem_adj = rasterio.open(dem_adj_filename,'r')

#Query out usgs_gages that don't have rating curve data
usgs_gages = usgs_gages.query('curve == "yes"')

if input_flows.HydroID.dtype != 'int': input_flows.HydroID = input_flows.HydroID.astype(int)

# Identify closest HydroID
closest_catchment = gpd.sjoin(usgs_gages, input_catchment, how='left', op='within').reset_index(drop=True)
closest_hydro_id = closest_catchment.filter(items=['site_no','HydroID','min_thal_elev','med_thal_elev','max_thal_elev', 'order_'])
closest_hydro_id = closest_catchment.filter(items=['location_id','HydroID','min_thal_elev','med_thal_elev','max_thal_elev', 'order_'])
closest_hydro_id = closest_hydro_id.dropna()

# Get USGS gages that are within catchment boundaries
usgs_gages = usgs_gages.loc[usgs_gages.site_no.isin(list(closest_hydro_id.site_no))]
usgs_gages = usgs_gages.loc[usgs_gages.location_id.isin(list(closest_hydro_id.location_id))]

columns = ['location_id','HydroID','dem_elevation','dem_adj_elevation','min_thal_elev', 'med_thal_elev','max_thal_elev','str_order']
gage_data = []
@@ -58,11 +61,11 @@
for index, gage in usgs_gages.iterrows():

# Get stream attributes
hydro_id = closest_hydro_id.loc[closest_hydro_id.site_no==gage.site_no].HydroID.item()
str_order = str(int(closest_hydro_id.loc[closest_hydro_id.site_no==gage.site_no].order_.item()))
min_thal_elev = round(closest_hydro_id.loc[closest_hydro_id.site_no==gage.site_no].min_thal_elev.item(),2)
med_thal_elev = round(closest_hydro_id.loc[closest_hydro_id.site_no==gage.site_no].med_thal_elev.item(),2)
max_thal_elev = round(closest_hydro_id.loc[closest_hydro_id.site_no==gage.site_no].max_thal_elev.item(),2)
hydro_id = closest_hydro_id.loc[closest_hydro_id.location_id==gage.location_id].HydroID.item()
str_order = str(int(closest_hydro_id.loc[closest_hydro_id.location_id==gage.location_id].order_.item()))
min_thal_elev = round(closest_hydro_id.loc[closest_hydro_id.location_id==gage.location_id].min_thal_elev.item(),2)
med_thal_elev = round(closest_hydro_id.loc[closest_hydro_id.location_id==gage.location_id].med_thal_elev.item(),2)
max_thal_elev = round(closest_hydro_id.loc[closest_hydro_id.location_id==gage.location_id].max_thal_elev.item(),2)

# Convert headwater point geometries to WKB representation
wkb_gages = dumps(gage.geometry)
@@ -90,7 +93,7 @@ def crosswalk_usgs_gage(usgs_gages_filename,dem_filename,input_flows_filename,in
dem_adj_elev = round(list(rasterio.sample.sample_gen(dem_adj,shply_referenced_gage.coords))[0].item(),2)

# Append dem_m_elev, dem_adj_elev, hydro_id, and gage number to table
site_elevations = [str(gage.site_no), str(hydro_id), dem_m_elev, dem_adj_elev, min_thal_elev, med_thal_elev, max_thal_elev,str(str_order)]
site_elevations = [str(gage.location_id), str(hydro_id), dem_m_elev, dem_adj_elev, min_thal_elev, med_thal_elev, max_thal_elev,str(str_order)]
gage_data.append(site_elevations)


25 changes: 25 additions & 0 deletions tools/rating_curve_comparison.py
@@ -13,6 +13,8 @@
from os.path import isfile, join, dirname
import shutil
import warnings
from pathlib import Path
import time
warnings.simplefilter(action='ignore', category=FutureWarning)

"""
@@ -33,6 +35,26 @@
stat_groups : str
string of columns to group eval metrics.
"""
def check_file_age(file):
    '''
    Checks whether the file exists, determines the file age, and recommends
    updating if it is older than 30 days.

    Returns
    -------
    check : str
        Message reporting the file age (or that the file does not exist).
    '''
    file = Path(file)
    if file.is_file():
        modification_time = file.stat().st_mtime
        current_time = time.time()
        file_age_days = (current_time - modification_time)/86400
        if file_age_days > 30:
            check = f'{file.name} is {int(file_age_days)} days old, consider updating.\nUpdate with rating_curve_get_usgs_curves.py'
        else:
            check = f'{file.name} is {int(file_age_days)} days old.'
    else:
        check = f'{file.name} does not exist.'

    return check

# recurr_intervals = ['recurr_1_5_cms.csv','recurr_5_0_cms.csv','recurr_10_0_cms.csv']

@@ -374,6 +396,9 @@ def calculate_rc_stats_elev(rc,stat_groups=None):
tables_dir = join(output_dir,'tables')
os.makedirs(tables_dir, exist_ok=True)

# Check age of gages csv and recommend updating if older than 30 days.
print(check_file_age(usgs_gages_filename))

# Open log file
sys.__stdout__ = sys.stdout
log_file = open(join(output_dir,'rating_curve_comparison.log'),"w")
