Bug fixes to new evaluation code and README.md cleanup
- Fixed a bug in synthesize_test_cases.py where the extent (MS/FR) was not being written properly to the merged metrics file.
- Fixed a bug in synthesize_test_cases.py where only BLE test cases were being written to the merged metrics file.
- Removed unused imports from inundation.py.
- Updated README.md

This resolves #270.
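The extent bug described above came from independent substring tests that could silently fall through. A minimal sketch of the corrected selection logic (function name hypothetical; the elif chain and 'FR' fallback mirror the merged `synthesize_test_cases.py`):

```python
def parse_extent_config(version):
    """Derive the extent (MS/FR) from a version directory name.

    Mirrors the corrected logic in synthesize_test_cases.py: an
    elif chain with 'FR' as the fallback, replacing the old
    independent `if` tests that could overwrite each other.
    """
    if '_fr' in version:
        return 'FR'
    elif '_ms' in version:
        return 'MS'
    else:
        return 'FR'

print(parse_extent_config('fim_3_0_5_3_ms'))  # MS
```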
BradfordBates-NOAA authored Feb 24, 2021
1 parent ffa0a00 commit e2ae250
Showing 6 changed files with 62 additions and 47 deletions.
22 changes: 18 additions & 4 deletions CHANGELOG.md
@@ -1,15 +1,28 @@
All notable changes to this project will be documented in this file.
We follow the [Semantic Versioning 2.0.0](http://semver.org/) format.

## v3.0.5.2 - 2021-02-23
## v3.0.5.3 - 2021-02-23 - [PR #275](https://github.com/NOAA-OWP/cahaba/pull/275)

Adding HAND SRC datum elev values to `hydroTable.csv` output
Bug fixes to new evaluation code.

### Changes

- Updated `add_crosswalk.py` to included "Median_Thal_Elev_m" variable outputs in hydroTable.csv
- Renamed hydroid attribute in `rem.py` to "Median" in case we want to include other statistics in the future (e.g. min, max, range etc.)
- Fixed a bug in `synthesize_test_cases.py` where the extent (MS/FR) was not being written properly to the merged metrics file.
- Fixed a bug in `synthesize_test_cases.py` where only BLE test cases were being written to the merged metrics file.
- Removed unused imports from `inundation.py`.
- Updated `README.md`.

<br/><br/>
## v3.0.5.2 - 2021-02-23 - [PR #272](https://github.com/NOAA-OWP/cahaba/pull/272)

Adds HAND synthetic rating curve (SRC) datum elevation values to `hydroTable.csv` output.

### Changes

- Updated `add_crosswalk.py` to include "Median_Thal_Elev_m" variable outputs in `hydroTable.csv`.
- Renamed hydroid attribute in `rem.py` to "Median" in case we want to include other statistics in the future (e.g. min, max, range etc.).

<br/><br/>
## v3.0.5.1 - 2021-02-22

Fixed `TEST_CASES_DIR` path in `tests/utils/shared_variables.py`.
@@ -18,6 +31,7 @@ Fixed `TEST_CASES_DIR` path in `tests/utils/shared_variables.py`.

- Removed `"_new"` from `TEST_CASES_DIR` variable.

<br/><br/>
## v3.0.5.0 - 2021-02-22 - [PR #267](https://github.com/NOAA-OWP/cahaba/pull/267)

Enhancements to allow for evaluation at AHPS sites, the generation of a query-optimized metrics CSV, and the generation of categorical FIM. This merge requires that the `/test_cases` directory be updated for all machines performing evaluation.
5 changes: 2 additions & 3 deletions README.md
@@ -2,7 +2,7 @@

Flood inundation mapping software configured to work with the U.S. National Water Model operated and maintained by the National Oceanic and Atmospheric Administration (NOAA) National Water Center (NWC).

This software uses the Height Above Nearest Drainage (HAND) method to generate Relative Elevation Models (REMs), Synthetic Rating Curves (SRCs), and catchment grids, which together are used to produce flood inundation maps (FIMs). This repository also includes functionality to generate FIMs and tests to evaluate FIM prediction skill.
This software uses the Height Above Nearest Drainage (HAND) method to generate Relative Elevation Models (REMs), Synthetic Rating Curves (SRCs), and catchment grids. This repository also includes functionality to generate flood inundation maps (FIMs) and evaluate FIM accuracy.

## Dependencies

@@ -33,8 +33,7 @@ Make sure to set the config folder group to 'fim' recursively using the chown command
The following input data sources should be downloaded and preprocessed prior to executing the preprocessing & hydrofabric generation code:
### USACE National Levee Database:
- Access here: https://levees.sec.usace.army.mil/
- Recommend downloading the “Full GeoJSON” file for the area of interest
- Unzip data and then use the preprocessing scripts to filter data and fix geometries where needed
- Download the “Full GeoJSON” file for the area of interest
- Unzip data and then use the preprocessing scripts to filter data and fix geometries where needed

### NHDPlus HR datasets
12 changes: 4 additions & 8 deletions tests/inundation.py
@@ -1,27 +1,23 @@
#!/usr/bin/env python3

import sys
import numpy as np
import pandas as pd
from numba import njit, typeof, typed, types
from numba import njit, typed, types
from concurrent.futures import ThreadPoolExecutor,as_completed
from subprocess import run
from os.path import splitext
import rasterio
import fiona
import shapely
from shapely.geometry import shape
from fiona.crs import to_string
from rasterio.errors import WindowError
from rasterio.mask import mask
from rasterio.io import DatasetReader,DatasetWriter
from rasterio.features import shapes,geometry_window,dataset_features
from rasterio.windows import transform,Window
from collections import OrderedDict
import argparse
from warnings import warn
from gdal import BuildVRT
import geopandas as gpd


def inundate(
rem,catchments,catchment_poly,hydro_table,forecast,mask_type,hucs=None,hucs_layerName=None,
subset_hucs=None,num_workers=1,aggregate=False,inundation_raster=None,inundation_polygon=None,
@@ -464,7 +460,7 @@ def __subset_hydroTable_to_forecast(hydroTable,forecast,subset_hucs=None):

if hydroTable.empty:
print ("All stream segments in HUC are within lake boundaries.")
sys.exit(0)
return

elif isinstance(hydroTable,pd.DataFrame):
pass #consider checking for correct dtypes, indices, and columns
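The switch from `sys.exit(0)` to `return` in `__subset_hydroTable_to_forecast` matters because this evaluation code runs work in a `ThreadPoolExecutor`: `sys.exit()` only raises `SystemExit`, which a worker thread's executor records on the future instead of cleanly stopping anything. A minimal illustration (worker names hypothetical):

```python
import sys
from concurrent.futures import ThreadPoolExecutor

def worker_exit():
    # sys.exit() just raises SystemExit; in a worker thread the
    # executor captures it on the future rather than exiting.
    sys.exit(0)

def worker_return():
    # Returning lets the caller decide how to handle the empty case.
    return None

with ThreadPoolExecutor(max_workers=1) as pool:
    exit_future = pool.submit(worker_exit)
    return_future = pool.submit(worker_return)

print(isinstance(exit_future.exception(), SystemExit))  # True
print(return_future.exception() is None)                # True
```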
23 changes: 9 additions & 14 deletions tests/run_test_case.py
@@ -103,11 +103,6 @@ def run_alpha_test(fim_run_dir, version, test_id, magnitude, compare_to_previous
lid_list.append(lid)
inundation_raster_list.append(os.path.join(version_test_case_dir, lid + '_inundation_extent.tif'))
extent_file_list.append(os.path.join(lid_dir, lid + '_extent.shp'))

ahps_inclusion_zones_dir = os.path.join(version_test_case_dir_parent, 'ahps_domains')

if not os.path.exists(ahps_inclusion_zones_dir):
os.mkdir(ahps_inclusion_zones_dir)

else:
benchmark_raster_file = os.path.join(TEST_CASES_DIR, benchmark_category + '_test_cases', 'validation_data_' + benchmark_category, current_huc, magnitude, benchmark_category + '_huc_' + current_huc + '_depth_' + magnitude + '.tif')
@@ -190,7 +185,7 @@ def run_alpha_test(fim_run_dir, version, test_id, magnitude, compare_to_previous
# Parse arguments.
parser = argparse.ArgumentParser(description='Inundation mapping and regression analysis for FOSS FIM. Regression analysis results are stored in the test directory.')
parser.add_argument('-r','--fim-run-dir',help='Name of directory containing outputs of fim_run.sh',required=True)
parser.add_argument('-b', '--version-name',help='The name of the working version in which features are being tested',required=True,default="")
parser.add_argument('-b', '--version',help='The name of the working version in which features are being tested',required=True,default="")
parser.add_argument('-t', '--test-id',help='The test_id to use. Format as: HUC_BENCHMARKTYPE, e.g. 12345678_ble.',required=True,default="")
parser.add_argument('-m', '--mask-type', help='Specify \'huc\' (FIM < 3) or \'filter\' (FIM >= 3) masking method', required=False,default="huc")
parser.add_argument('-y', '--magnitude',help='The magnitude to run.',required=False, default="")
@@ -210,14 +205,14 @@ def run_alpha_test(fim_run_dir, version, test_id, magnitude, compare_to_previous
print()

# Ensure test_id is valid.
if args['test_id'] not in valid_test_id_list:
print(TRED_BOLD + "Warning: " + WHITE_BOLD + "The provided test_id (-t) " + CYAN_BOLD + args['test_id'] + WHITE_BOLD + " is not available." + ENDC)
print(WHITE_BOLD + "Available test_ids include: " + ENDC)
for test_id in valid_test_id_list:
if 'validation' not in test_id.split('_') and 'ble' in test_id.split('_'):
print(CYAN_BOLD + test_id + ENDC)
print()
exit_flag = True
# if args['test_id'] not in valid_test_id_list:
# print(TRED_BOLD + "Warning: " + WHITE_BOLD + "The provided test_id (-t) " + CYAN_BOLD + args['test_id'] + WHITE_BOLD + " is not available." + ENDC)
# print(WHITE_BOLD + "Available test_ids include: " + ENDC)
# for test_id in valid_test_id_list:
# if 'validation' not in test_id.split('_') and 'ble' in test_id.split('_'):
# print(CYAN_BOLD + test_id + ENDC)
# print()
# exit_flag = True

# Ensure fim_run_dir exists.
if not os.path.exists(os.path.join(os.environ['outputDataDir'], args['fim_run_dir'])):
41 changes: 26 additions & 15 deletions tests/synthesize_test_cases.py
Expand Up @@ -7,7 +7,7 @@
import csv

from run_test_case import run_alpha_test
from utils.shared_variables import TEST_CASES_DIR, PREVIOUS_FIM_DIR, OUTPUTS_DIR
from utils.shared_variables import TEST_CASES_DIR, PREVIOUS_FIM_DIR, OUTPUTS_DIR, AHPS_BENCHMARK_CATEGORIES


def create_master_metrics_csv(master_metrics_csv_output):
@@ -57,7 +57,7 @@ def create_master_metrics_csv(master_metrics_csv_output):
]

additional_header_info_prefix = ['version', 'nws_lid', 'magnitude', 'huc']
list_to_write = [additional_header_info_prefix + metrics_to_write + ['full_json_path'] + ['flow'] + ['benchmark_source'] + ['extent_config']]
list_to_write = [additional_header_info_prefix + metrics_to_write + ['full_json_path'] + ['flow'] + ['benchmark_source'] + ['extent_config'] + ["calibrated"]]

versions_to_aggregate = os.listdir(PREVIOUS_FIM_DIR)

@@ -77,17 +77,20 @@

for magnitude in ['100yr', '500yr']:
for version in versions_to_aggregate:
if '_fr_' in version:
if '_fr' in version:
extent_config = 'FR'
if '_ms_' in version:
elif '_ms' in version:
extent_config = 'MS'
if '_fr_' or '_ms_' not in version:
else:
extent_config = 'FR'
if "_c" in version and version.split('_c')[1] == "":
calibrated = "yes"
else:
calibrated = "no"
version_dir = os.path.join(official_versions, version)
magnitude_dir = os.path.join(version_dir, magnitude)

if os.path.exists(magnitude_dir):

magnitude_dir_list = os.listdir(magnitude_dir)
for f in magnitude_dir_list:
if '.json' in f:
@@ -104,33 +107,37 @@ def create_master_metrics_csv(master_metrics_csv_output):
sub_list_to_append.append(flow)
sub_list_to_append.append(benchmark_source)
sub_list_to_append.append(extent_config)
sub_list_to_append.append(calibrated)

list_to_write.append(sub_list_to_append)
except ValueError:
pass

if benchmark_source in ['nws', 'usgs']:
test_cases_list = os.listdir(TEST_CASES_DIR)
if benchmark_source in AHPS_BENCHMARK_CATEGORIES:
test_cases_list = os.listdir(benchmark_test_case_dir)

for test_case in test_cases_list:
try:
int(test_case.split('_')[0])

huc = test_case.split('_')[0]
official_versions = os.path.join(benchmark_test_case_dir, test_case, 'performance_archive', 'previous_versions')
official_versions = os.path.join(benchmark_test_case_dir, test_case, 'official_versions')

for magnitude in ['action', 'minor', 'moderate', 'major']:
for version in versions_to_aggregate:
if '_fr_' in version:
if '_fr' in version:
extent_config = 'FR'
if '_ms_' in version:
elif '_ms' in version:
extent_config = 'MS'
if '_fr_' or '_ms_' not in version:
else:
extent_config = 'FR'
if "_c" in version and version.split('_c')[1] == "":
calibrated = "yes"
else:
calibrated = "no"

version_dir = os.path.join(official_versions, version)
magnitude_dir = os.path.join(version_dir, magnitude)

if os.path.exists(magnitude_dir):
magnitude_dir_list = os.listdir(magnitude_dir)
for f in magnitude_dir_list:
@@ -159,6 +166,7 @@ def create_master_metrics_csv(master_metrics_csv_output):
sub_list_to_append.append(flow)
sub_list_to_append.append(benchmark_source)
sub_list_to_append.append(extent_config)
sub_list_to_append.append(calibrated)

list_to_write.append(sub_list_to_append)
except ValueError:
@@ -202,7 +210,7 @@ def process_alpha_test(args):
parser.add_argument('-b','--benchmark-category',help='A benchmark category to specify. Defaults to process all categories.',required=False, default="all")
parser.add_argument('-o','--overwrite',help='Overwrite all metrics or only fill in missing metrics.',required=False, action="store_true")
parser.add_argument('-m','--master-metrics-csv',help='Define path for master metrics CSV file.',required=True)

# Assign variables from arguments.
args = vars(parser.parse_args())
config = args['config']
@@ -212,6 +220,10 @@
benchmark_category = args['benchmark_category']
overwrite = args['overwrite']
master_metrics_csv = args['master_metrics_csv']

if overwrite:
if input("Are you sure you want to overwrite metrics? y/n: ") == "n":
quit()

# Default to processing all possible versions in PREVIOUS_FIM_DIR. Otherwise, process only the user-supplied version.
if fim_version != "all":
@@ -265,7 +277,6 @@ def process_alpha_test(args):
if not os.path.exists(fim_run_dir):
if config == 'DEV':
fim_run_dir = os.path.join(OUTPUTS_DIR, version, current_huc[:6])
print(fim_run_dir)
elif config == 'PREV':
fim_run_dir = os.path.join(PREVIOUS_FIM_DIR, version, current_huc[:6])
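The new `calibrated` column added in this diff is derived purely from the version directory name: a version counts as calibrated when it ends in `_c`. A standalone sketch mirroring the merged check (function name hypothetical; a simpler near-equivalent would be `version.endswith('_c')`, though the split-based test behaves differently if `_c` appears more than once):

```python
def is_calibrated(version):
    # Matches the merged logic in synthesize_test_cases.py: "_c" is
    # present and the text after the first "_c" split is empty,
    # i.e. the name ends with that suffix.
    if "_c" in version and version.split('_c')[1] == "":
        return "yes"
    return "no"

print(is_calibrated('fim_3_0_5_3_ms_c'))  # yes
```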

6 changes: 3 additions & 3 deletions tests/utils/shared_variables.py
@@ -1,16 +1,16 @@
import os

# Environmental variables and constants.
TEST_CASES_DIR = r'/data/test_cases/'
PREVIOUS_FIM_DIR = r'/data/previous_fim'
OUTPUTS_DIR = os.environ['outputDataDir']
INPUTS_DIR = r'/data/inputs'
AHPS_BENCHMARK_CATEGORIES = ['usgs', 'nws']
PRINTWORTHY_STATS = ['CSI', 'TPR', 'TNR', 'FAR', 'MCC', 'TP_area_km2', 'FP_area_km2', 'TN_area_km2', 'FN_area_km2', 'contingency_tot_area_km2', 'TP_perc', 'FP_perc', 'TN_perc', 'FN_perc']
GO_UP_STATS = ['CSI', 'TPR', 'MCC', 'TN_area_km2', 'TP_area_km2', 'TN_perc', 'TP_perc', 'TNR']
GO_DOWN_STATS = ['FAR', 'FN_area_km2', 'FP_area_km2', 'FP_perc', 'FN_perc']
AHPS_BENCHMARK_CATEGORIES = ['usgs', 'ble']



# Colors.
ENDC = '\033[m'
TGREEN_BOLD = '\033[32;1m'
TGREEN = '\033[32m'
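The diff above also changes how AHPS benchmark sources are detected in `synthesize_test_cases.py`: instead of hard-coding `['nws', 'usgs']` at the call site, membership is tested against the shared `AHPS_BENCHMARK_CATEGORIES` constant. A minimal sketch of that pattern (helper name hypothetical; constant value as in the new `shared_variables.py`):

```python
AHPS_BENCHMARK_CATEGORIES = ['usgs', 'nws']

def uses_ahps_layout(benchmark_source):
    # AHPS benchmark sources take the per-lid evaluation path;
    # other categories (e.g. 'ble') take the HUC/magnitude path.
    return benchmark_source in AHPS_BENCHMARK_CATEGORIES

print(uses_ahps_layout('nws'))  # True
```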
