The purpose of this document is to delineate the processing steps used to join lake identifiers from 5 data products into a single lake identifier table called LAGOS Lake Link (working name).
“Lake”: Permanent lake or reservoir.
LAGOS Lake Link is a tabular dataset (a crosswalk table) that can be used to connect one lake-related dataset to another for many common lake datasets. Several lake datasets are in common use on their own or as a base for scientific data products and LAGOS Lake Link is intended to make it easier to combine lake-related data between multiple sources. The table can be searched to find identifiers and location for a particular lake, or it can be used in data join operations to convert identifiers en masse.
There are two main spatial representations used for lake locations: polygons, used in the various NHD products (National Hydrography Dataset), and points, used in other national datasets. In addition to the spatial representations, there are a variety of identifiers found among the datasets portraying lakes in the United States: state agency IDs, Water Quality Portal MonitoringLocationIdentifier (from systems such as STORET and NWIS), NHDPlusV2 COMID, NHD ReachCode, NHD-HR Permanent_Identifier, GNIS_ID. Finally, lakes are sometimes named and can be identified by their name alone if it unique, or their name and geographic context.
There are several obstacles to reliably identifying common lakes between these datasets:
When I say “inadequate” or “inconsistent” here, I am describing the quality relative to my specific task of connecting datasets. Independently, these datasets have high standards for data quality–but two datasets may use, for example, different definitions of the lake entity, and that is where problems arise for my task.
5 main datasets will be introduced to create the crosswalk: LAGOS (US), GNIS, WQP, NHDPlusV2 (medium-resolution), and LAGOS-NE. For each dataset, the workflow is as follows:
Finally, the results of the multiple joins in step 4 will be merged into a single crosswalk table with only the necessary fields remaining.
The LAGOS-US lake population is the focal dataset of the LAGOS Lake Link crosswalk (top center, Figure ). We will connect all other datasets to this one in the crosswalk. Lakes will only This dataset is a subset of the NHD High Resolution NHDWaterbody layers (see “Filter” section for details) representing permanent, non-artificial LakePond and Reservoir waterbodies. Currently, LAGOS-US is under development and while the lake polygon dataset is completed, much of the database remains to be built in 2018 and 2019.
The primary identifier for lakes in this dataset is lagoslakeid. A secondary unique identifier is named nhdid in the “locus” table, which corresponds to the Permanent_Identifier in the NHD and in the GIS layer imported here.
We’ll import the GIS polygon layer prepared for LAGOS-US. Glimpse the polygon layer:
Observations: 479,950
The LAGOS-US dataset has already been filtered from the source dataset of all NHDWaterbody features to generate the final target lake population. The code is not included but the processing rules are summarized below.
Criteria for selecting NHD lake polygons for inclusion in LAGOS-US were as follows. Included lakes must:
The number of LAGOS-US lakes after filtering is 479950.
We will select several columns and create two alternate representations of the LAGOS-US dataset: centroids and no geometry.
lagos <- lagos_sf %>%
select(lagoslakeid,
Permanent_Identifier,
GNIS_ID,
GNIS_Name,
ReachCode,
STATE,
AreaSqKm,
FDate,
LAGOS_CountyName = NAMELSAD,
LAGOS_CountyFIPS = FIPS)
lagos_centroid <- lagos_sf %>%
st_centroid() %>%
st_transform(NAD83) %>%
mutate(LAGOS_LatitudeNAD83 = st_coordinates(.)[,'Y'],
LAGOS_LongitudeNAD83 = st_coordinates(.)[,'X']) %>%
select(lagoslakeid, LAGOS_LatitudeNAD83, LAGOS_LongitudeNAD83)
lagos_df <- lagos %>% st_set_geometry(NULL)
The GNIS dataset contains place names for a variety of geographic features, including lakes, in the United States. Locations are represented as latitude/longitude pairs. Most NHD lakes already have a GNIS name assigned, but in some cases where lakes had multiple names, they were not assigned a name in the NHD. To populate the lake name as fully as possible, we searched the GNIS dataset to both add missing names where available and to document lakes with multiple names, which we then retain in LAGOS-US.
Glimpse the original GNIS table.
Observations: 2,276,506
Variables: 20
$ FEATURE_ID <int> 0, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412...
$ FEATURE_NAME <chr> "Bruceville Cemetery", "Agua Sal Creek", "Agua Sal Wash", "Aguaje Draw"...
$ FEATURE_CLASS <chr> "Cemetery", "Stream", "Valley", "Valley", "Park", "Stream", "Stream", "...
$ STATE_ALPHA <chr> "CO", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ", "AZ",...
$ STATE_NUMERIC <chr> "08", "04", "04", "04", "04", "04", "04", "04", "04", "04", "04", "04",...
$ COUNTY_NAME <chr> "Boulder", "Apache", "Apache", "Apache", "Maricopa", "Graham", "Apache"...
$ COUNTY_NUMERIC <chr> "013", "001", "001", "001", "013", "009", "001", "019", "017", "005", "...
$ PRIMARY_LAT_DMS <chr> "400058N", "362740N", "363246N", "343417N", "331455N", "322815N", "3552...
$ PRIM_LONG_DMS <chr> "1051226W", "1092842W", "1093103W", "1091313W", "1124625W", "1095610W",...
$ PRIM_LAT_DEC <dbl> 40.01610, 36.46111, 36.54611, 34.57143, 33.24865, 32.47090, 35.87501, 3...
$ PRIM_LONG_DEC <dbl> -105.2072, -109.4784, -109.5176, -109.2204, -112.7735, -109.9362, -109....
$ SOURCE_LAT_DMS <chr> NA, "362053N", "362740N", "344308N", NA, "323048N", "354903N", "315820N...
$ SOURCE_LONG_DMS <chr> NA, "1090915W", "1092842W", "1085826W", NA, "1095233W", "1093001W", "11...
$ SOURCE_LAT_DEC <dbl> NA, 36.34806, 36.46111, 34.71880, NA, 32.51340, 35.81750, 31.97230, 34....
$ SOURCE_LONG_DEC <dbl> NA, -109.1543, -109.4784, -108.9739, NA, -109.8759, -109.5003, -111.391...
$ ELEV_IN_M <int> 1598, 1645, 1597, 1750, 231, 1339, 1799, 591, 1494, 1435, 1506, 81, 144...
$ ELEV_IN_FT <int> 5243, 5397, 5239, 5741, 758, 4393, 5902, 1939, 4902, 4708, 4941, 266, 4...
$ MAP_NAME <chr> "Niwot", "Fire Dance Mesa", "Little Round Rock", "Kearn Lake", "Spring ...
$ DATE_CREATED <chr> "05/04/2015", "02/08/1980", "02/08/1980", "02/08/1980", "02/08/1980", "...
$ DATE_EDITED <chr> "10/22/2015", NA, NA, "01/14/2008", NA, NA, NA, NA, NA, NA, NA, NA, NA,...
Included GNIS sites must:
The number of GNIS sites for U.S. lakes is 138055.
According to the metadata, the coordinates use the NAD83 datum. Convert the XY data to points.
Select the FEATURE_ID, FEATURE_NAME, and FEATURE_CLASS.
gnis <- gnis_sf %>%
select(FEATURE_ID, FEATURE_NAME, FEATURE_CLASS)
First, join LAGOS-US to GNIS based on the shared GNIS_ID.
GNIS Names contained by the polygon are assigned to the lake. A new column, “LAGOS Lake Name” is created and multiple names for a lake are separated by semi-colons. This will eliminate the 1:many relationship created in the join.
Select only the desired names and identifiers.
lagos_gnis <- lagos_gnis_3 %>%
st_set_geometry(NULL) %>%
select(lagoslakeid, Permanent_Identifier, GNIS_ID, GNIS_Name, ReachCode, STATE, LAGOS_LakeName, AreaSqKm)
The WQP API allows queries of all water quality monitoring sites that are submitted to major national databases. Sites are described with identifiers, coordinates, and names, but are not definitively linked to any polygon representation of lakes to our knowledge. The goal here is to match the WQP identifiers with lakes in LAGOS-US.
Glimpse the original WQP table.
Observations: 2,300,409
Variables: 36
$ OrganizationIdentifier <chr> "ARS", "ARS", "ARS", "ARS", "ARS", "ARS...
$ OrganizationFormalName <chr> "Agricultural Research Service", "Agric...
$ MonitoringLocationIdentifier <chr> "ARS-IASF-IABC264", "ARS-IASF-IABC274",...
$ MonitoringLocationName <chr> "IABC264", "IABC274", "IABC350", "IASF2...
$ MonitoringLocationTypeName <chr> "Land Runoff", "Land Runoff", "River/St...
$ MonitoringLocationDescriptionText <chr> "A drainage ditch monitoring site on 19...
$ HUCEightDigitCode <chr> "07080207", "07080207", "07080207", "07...
$ DrainageAreaMeasure.MeasureValue <dbl> NA, NA, 18210.0, NA, NA, NA, NA, NA, NA...
$ DrainageAreaMeasure.MeasureUnitCode <chr> NA, NA, "ha", NA, NA, NA, NA, NA, NA, N...
$ ContributingDrainageAreaMeasure.MeasureValue <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ ContributingDrainageAreaMeasure.MeasureUnitCode <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ LatitudeMeasure <dbl> 42.41918, 42.40710, 42.33911, 42.54317,...
$ LongitudeMeasure <dbl> -93.18051, -93.20679, -93.12500, -93.47...
$ SourceMapScaleNumeric <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ HorizontalAccuracyMeasure.MeasureValue <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ HorizontalAccuracyMeasure.MeasureUnitCode <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ HorizontalCollectionMethodName <chr> "GPS Code (Pseudo Range) Differential",...
$ HorizontalCoordinateReferenceSystemDatumName <chr> "NAD83", "NAD83", "NAD83", "NAD83", "NA...
$ VerticalMeasure.MeasureValue <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ VerticalMeasure.MeasureUnitCode <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ VerticalAccuracyMeasure.MeasureValue <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ VerticalAccuracyMeasure.MeasureUnitCode <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ VerticalCollectionMethodName <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ VerticalCoordinateReferenceSystemDatumName <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ CountryCode <chr> "US", "US", "US", "US", "US", "US", "US...
$ StateCode <chr> "19", "19", "19", "19", "19", "19", "19...
$ CountyCode <chr> "083", "083", "083", "083", "079", "083...
$ AquiferName <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ FormationTypeText <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ AquiferTypeName <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ ConstructionDateText <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ WellDepthMeasure.MeasureValue <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ WellDepthMeasure.MeasureUnitCode <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ WellHoleDepthMeasure.MeasureValue <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ WellHoleDepthMeasure.MeasureUnitCode <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
$ ProviderName <chr> "STEWARDS", "STEWARDS", "STEWARDS", "ST...
Included WQP sites must:
The number of WQP sites meeting these criteria is 131005.
There is a column in WQP data to indicate the coordinate reference system (CRS). However, not all sites had a value in this column. Prior analysis showed that the most commonly indicated column was NAD83, and so for all sites with an unknown or rare CRS (distant U.S. territories), the NAD83 datum was imputed instead.
For sites with one of the following other datums, the actual datum specification is preserved: NAD27, WGS84. In order to manage the projections before transforming all the data to a common projection, Albers Equal Area Conic (USGS), we split up the dataset, projected the sites in each datum, converted all rows to Albers USGS, and then reunited all rows at the end.
wqp <- wqp_sf %>%
select(OrganizationIdentifier, MonitoringLocationIdentifier, MonitoringLocationName, ProviderName, MonitoringLocationTypeName)
Examination of shared words in the names: All lakes where two names are available but don’t share words. Some of these lakes have names following some kind of technical form, but others seems to suggest mis-referenced lakes even though the point for the site falls inside the lake. Some examples (zoom in to see NHD detail appear):
Join them all up into one and enforce the rules.
How many joins were made for each method?
Some lakes have a very high number of sampling sites. This map shows some examples of lakes with over 50 sampling sites stored.
Retain only columns relevant to crosswalk.
lagos_wqp <- lagos_wqp_all %>%
st_set_geometry(NULL) %>%
select(OrganizationIdentifier, MonitoringLocationIdentifier, MonitoringLocationName, ProviderName, Linked_lagoslakeid)
The medium-resolution NHD contains a popular medium-resolution representation of lakes and their hydrographic context. Because the high-resolution NHD data were used to generate LAGOS-US, we must find a connection between the datasets. This connection is the reach code, a 14-digit identifier used to identify each reach as part of a linear referencing system. ReachCodes must sometimes be modified (split, joined, re-assigned, deleted). The NHDReachCrossReference table tracks these changes and shows the new and old ReachCode.
The version of the NHDReachCrossReference table that we use has already been filtered to include only the reach codes for lakes and reservoirs in the NHD HR (e.g. streams, wetlands, etc. were removed) to reduce the table size. You could use the full original table with no change to the results.
Glimpse the NHD table.
Observations: 448,512
Variables: 22
$ COMID <int> 806161, 806167, 806169, 806175, 806179, 806181, 806183, 806189, 806197, 80...
$ FDATE <dttm> 1999-05-27, 1999-05-27, 1999-05-27, 1999-05-27, 1999-05-27, 1999-05-27, 1...
$ RESOLUTION <chr> "Medium", "Medium", "Medium", "Medium", "Medium", "Medium", "Medium", "Med...
$ GNIS_ID <dbl> NA, NA, 564483, 563828, 575611, NA, NA, NA, NA, 572518, NA, NA, 563822, NA...
$ GNIS_NAME <chr> " ", " ", "Cranberry Pond", "Chandler Lake", "Smith Brook Pond", " ", " ",...
$ AREASQKM <dbl> 0.013, 0.055, 0.176, 1.572, 0.069, 0.162, 0.136, 0.187, 0.049, 0.018, 0.01...
$ ELEVATION <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
$ REACHCODE <chr> "01010004001814", "01010004001819", "01010004001817", "01010004001854", "0...
$ FTYPE <chr> "LakePond", "LakePond", "LakePond", "LakePond", "LakePond", "LakePond", "L...
$ FCODE <int> 39004, 39004, 39004, 39004, 39004, 39004, 39004, 39004, 39004, 39004, 3900...
$ Shape_Length <dbl> 0.005468933, 0.021807035, 0.021755760, 0.087375064, 0.017330298, 0.0530475...
$ Shape_Area <dbl> 1.513469e-06, 6.416618e-06, 2.057280e-05, 1.841445e-04, 8.053727e-06, 1.89...
$ ONOFFNET <int> 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,...
$ PurpCode <chr> " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ",...
$ PurpDesc <chr> " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ",...
$ MeanDepth <dbl> 0.09450471, 0.09872492, 0.28167548, NA, 0.56829879, 0.13899818, 0.55463734...
$ LakeVolume <dbl> 1220.5177, 5407.7469, 49470.2953, NA, 39081.4758, 22468.7585, 75691.3313, ...
$ MaxDepth <dbl> 0.23044897, 0.23288202, 0.63270356, NA, 1.09330535, 0.27875821, 1.26695295...
$ MeanDUsed <dbl> 0.09450471, 0.09872492, 0.28167548, NA, 0.56829879, 0.13899818, 0.55463734...
$ MeanDCode <chr> "4", "4", "4", NA, "4", "4", "4", "4", "4", "4", "4", NA, "4", "4", "4", "...
$ LakeArea <dbl> 12914.89, 54775.91, 175628.68, NA, 68769.23, 161647.88, 136469.96, 187427....
$ Shape <sf_geometry [degree]> MULTIPOLYGON (((-68.31665 4..., MULTIPOLYGON (((-68.3228 ...
Glimpse the NHDReachCrossReference table.
Observations: 5,856,425
Variables: 4
$ OldReachCode <fct> 01010003000640, 01010004001528, 01010004001679, 01010004001536, 0101000400...
$ OldReachDate <dttm> 1997-05-21, 1997-05-21, 1997-05-21, 1997-05-21, 1997-05-21, 1997-05-21, 1...
$ NewReachCode <fct> 01010003000737, 01010004001871, 01010004001873, 01010004001874, 0101000400...
$ NewReachDate <dttm> 1998-07-04, 1998-07-04, 1998-07-04, 1998-07-04, 1998-07-04, 1998-07-04, 1...
We can’t use the join method to connect to any NHDPlusV2 lakes that don’t have REACHCODE populated. We will revisit these lakes along with others that won’t connect later.
These data are already spatial polygons and we want a dataset with no geometry instead.
nhd_plus <- nhd_plus_df %>%
select(COMID, FDATE, GNIS_ID, GNIS_NAME, REACHCODE, AREASQKM)
This join creates many:many relationships. A scant few ReachCodes in the LAGOS-US population are associated with two Permanent_Identifiers before the join. Most reach codes don’t have an entry in the cross reference table, suggesting that lakes still holding their original reach code should have the same reach code in any version of the NHD.
346721 reach codes have only a new reach code in NHDReachCrossReference and the OldReachCode is missing. In other words, about 72.24% of LAGOS-US lakes have never changed reach codes. About a tenth of the lakes have no entry in NHDReachCrossReference, suggesting this table is usually but not always populated when new reach codes are assigned.
Numerous lakes aren’t able to be connected between the NHDPlusV2 and the NHD HR. The HR contains more small lakes, so let’s look for a minute only at lakes over 10 hectares. Here are a few examples of lakes where the old reach codes are unavailable or not the target reach code.
For those lakes that didn’t link between NHD versions, try linking them through GNIS IDs.
It works. 5316 lakes that couldn’t be linked with NHDReachCrossReference can be linked with the GNIS_ID value.
lagos_nhd_not_missing <- lagos_nhd_all %>%
filter(!is.na(COMID))
lagos_nhd <- lagos_df %>%
left_join(lagos_nhd_not_missing, by = "lagoslakeid") %>% # removes OldReachCodes that don't link but keeps all lakes
left_join(nhd_gnis_nhd, by = "lagoslakeid") %>%
mutate(NHDPlusv2_COMID = coalesce(COMID.x, COMID.y)) %>%
mutate(NHDPlusv2_ReachCode = coalesce(OldReachCode, REACHCODE)) %>%
mutate(NHDPlusv2_GNIS_Name = coalesce(GNIS_NAME.x, GNIS_NAME.y)) %>% #all-caps is medium res
mutate(NHDPlusv2_AreaSqKm = coalesce(AREASQKM.x, AREASQKM.y)) %>%
select(lagoslakeid, NHDPlusv2_COMID, NHDPlusv2_ReachCode, NHDPlusv2_GNIS_Name, NHDPlusv2_AreaSqKm)
LAGOS-NE was the first release from the LAGOS project describing lakes in 17 states in the northeastern states. It is similar in most ways to LAGOS-US, but the NHD snapshots were taken several years ago. This dataset can be downloaded and unzipped from [https://portal.edirepository.org/nis/mapbrowse?packageid=edi.98.1].
When the LAGOS-US lagoslakeid values were generated, they were joined on Permanent_Identifier to the LAGOS-NE, so most of the lagoslakeid values should not have changed. Nonetheless, some could not be connected this way.
Glimpse the original LAGOS-NE dataset.
Observations: 141,265
Variables: 35
$ ComID <int> 145061933, 145050791, 145050750, 145050563, 142825415, 142825576, ...
$ Permanent_Identifier <fct> 145061933, 145050791, 145050750, 145050563, 142825415, 142825576, ...
$ FDate <dttm> 2006-01-11 16:55:46, 2006-01-11 15:27:23, 2006-01-11 15:26:37, 20...
$ GNIS_ID <dbl> NA, 4781, 4907, NA, NA, 5045, NA, 3906, NA, NA, NA, NA, 4214, NA, ...
$ GNIS_Name <fct> NA, Lower Hudson Pond, Mink Marsh Pond, NA, NA, Mud Pond, NA, Deep...
$ AreaSqKm <dbl> 0.039, 0.355, 0.073, 0.017, 0.021, 0.066, 0.018, 0.025, 0.014, 0.0...
$ Elevation <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ ReachCode <fct> 01010005000620, 01010004001847, 01010004001838, 01010004005842, 01...
$ FType <int> 390, 390, 390, 390, 390, 390, 390, 390, 390, 390, 390, 390, 390, 3...
$ FCode <int> 39004, 39004, 39004, 39004, 39004, 39004, 39004, 39004, 39004, 390...
$ Lake_Area_Ha <dbl> 3.888402, 35.450899, 7.317129, 1.684377, 2.088132, 6.614470, 1.817...
$ LakeConnectivity <fct> DR_Stream, DR_LakeStream, DR_Stream, Headwater, Headwater, DR_Stre...
$ HU4_ZoneID <fct> HU4_1, HU4_1, HU4_1, HU4_1, HU4_1, HU4_1, HU4_1, HU4_1, HU4_1, HU4...
$ HU4_Name <fct> St. John, St. John, St. John, St. John, St. John, St. John, St. Jo...
$ HU6_ZoneID <fct> HU6_1, HU6_1, HU6_1, HU6_1, HU6_1, HU6_1, HU6_1, HU6_1, HU6_1, HU6...
$ HU6_Name <fct> St. John, St. John, St. John, St. John, St. John, St. John, St. Jo...
$ HU8_ZoneID <fct> HU8_5, HU8_4, HU8_4, HU8_4, HU8_601, HU8_601, HU8_601, HU8_5, HU8_...
$ HU8_Name <fct> Meduxnekeag, Aroostook, Aroostook, Aroostook, St. Francis River-Sa...
$ HU12_ZoneID <fct> HU12_17971, HU12_17917, HU12_17915, HU12_17946, HU12_17762, HU12_1...
$ HU12_Name <fct> South Branch Meduxnekeag River, Mooseleuk Lake, Bartlett Brook-Spr...
$ EDU_ZoneID <fct> EDU_79, EDU_79, EDU_79, EDU_79, EDU_79, EDU_79, EDU_79, EDU_79, ED...
$ EDU_Name <fct> Upper St. John - Aroostook, Upper St. John - Aroostook, Upper St. ...
$ COUNTY_ZoneID <fct> County_303, County_312, County_312, County_303, County_303, County...
$ COUNTY_Name <fct> Aroostook County, Piscataquis County, Piscataquis County, Aroostoo...
$ State_Name <fct> Maine, Maine, Maine, Maine, Maine, Maine, Maine, Maine, Maine, Mai...
$ STATE_ZoneID <fct> State_1, State_1, State_1, State_1, State_1, State_1, State_1, Sta...
$ Shape_Length <dbl> 997.0809, 3626.5048, 2472.0295, 1602.0563, 1056.0734, 1325.3148, 8...
$ Shape_Area <dbl> 38884.02, 354508.99, 73171.29, 16843.77, 20881.32, 66144.70, 18171...
$ HU4 <fct> 0101, 0101, 0101, 0101, 0101, 0101, 0101, 0101, 0101, 0101, 0101, ...
$ HU6 <fct> 010100, 010100, 010100, 010100, 010100, 010100, 010100, 010100, 01...
$ HU12 <fct> 010100050102, 010100040303, 010100040301, 010100040904, 0101000804...
$ HU8 <fct> 01010005, 01010004, 01010004, 01010004, 01010008, 01010008, 010100...
$ STATE <fct> ME, ME, ME, ME, ME, ME, ME, ME, ME, ME, ME, ME, ME, ME, ME, ME, ME...
$ lagoslakeid <int> 29903, 79876, 52999, 97563, 89329, 125499, 132198, 131140, 133209,...
$ Shape <sf_geometry [m]> MULTIPOLYGON (((2148592 288..., MULTIPOLYGON (((204595...
No need to filter. These lakes had essentially the same criteria for inclusion as LAGO-US, above, except that they were confined to a 17-state region in the northeast U.S.
lagosNE <- lagosNE_orig %>%
select(LAGOSNE_lagoslakeid = lagoslakeid, LAGOSNE_PermanentIdentifier = Permanent_Identifier, LAGOSNE_ReachCode = ReachCode, GNIS_ID, GNIS_Name) %>%
st_set_geometry(NULL)
Unlike with NHDPlusV2, these two datasets shared the identifier Permanent_Identifier and we will try to join lakes on that column first. The subsequent two joins parallel the joins with NHDPlusV2, above. We will reuse the lagos_nhdx table.
After trying the code below (currently commented out), it seems upon further observation that the GNIS_ID values in LAGOS-NE have no useful relation to the latest GNIS_ID values.
lagosUS_lagosNE <- lagos_lagosNE_all %>%
filter(!is.na(LAGOSNE_lagoslakeid)) %>%
arrange(lagoslakeid)
The lakes in the NLA are identified with their NHDPlus COMID and ReachCode. It should be simple to add them to the crosswalk by doing a 1:1 join with the NHDPlusV2 COMID.
Import the two NLA files. It seems that the 2012 file already manages the link between the 2012 sites back to the 2007 sites.
Observations: 2,764
Variables: 111
$ SITE_ID <chr> "NLA12_AL-101", "NLA12_AL-101", "NLA12_AL-102", "NLA12_AL-103", "NLA12...
$ VISIT_NO <int> 1, 2, 1, 1, 1, 1, 2, NA, NA, NA, NA, NA, NA, 1, 1, 1, 1, 1, 1, 2, 1, 1...
$ UID <int> 6300, 7374, 7228, 7292, 7368, 6267, 7394, NA, NA, NA, NA, NA, NA, 7375...
$ DATE_COL <chr> "5/24/2012", "7/21/2012", "7/17/2012", "7/19/2012", "7/23/2012", "5/23...
$ AGGR_ECO3_2015 <chr> "EHIGH", "EHIGH", "EHIGH", "EHIGH", "PLNLOW", "PLNLOW", "PLNLOW", "PLN...
$ AGGR_ECO9_2015 <chr> "SAP", "SAP", "SAP", "SAP", "CPL", "CPL", "CPL", "CPL", "CPL", "SAP", ...
$ AREA_HA <dbl> 1301.871970, 1301.871970, 72.950420, 10363.258480, 4731.474336, 32.561...
$ AREACLS <chr> "F:500+", "F:500+", "D:50-100", "F:500+", "F:500+", "C:10-50", "C:10-5...
$ BORD_LAKE <chr> "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No"...
$ CAT_UNIQUE <chr> "NLA07RVT", "NLA07RVT", "NLA07RVT", "NLA07RVT", "NLA07RVT", "NLA12NEW"...
$ CH0712_CAT <chr> "NLA07_RVT12", "NLA07_RVT12", "NLA07_RVT12", "NLA07_RVT12", "NLA07_RVT...
$ CH0712_USE <chr> "Include", "Include", "Include", "Include", "Include", "Include", "Inc...
$ CH0712_WGT <dbl> 20.288885, 20.288885, 46.617902, 51.331899, 101.491046, 83.376963, 83....
$ CNTYNAME <chr> "TUSCALOOSA", "TUSCALOOSA", "CULLMAN", "TALLADEGA", "SUMTER", "WASHING...
$ COMID2007 <int> 21656744, 21656744, 18587568, 22273142, 21677972, 21639448, 21639448, ...
$ COMID2012 <int> 166759236, 166759236, 18587568, 166759059, 21677972, 21639448, 2163944...
$ COMIDS2007 <chr> "21656744", "21656744", "18587568", "22273142", "21677972", "21639448"...
$ DES_FTYPE <chr> "Lake/Pond: Hydrographic Category = Perennial; Stage = Average Water E...
$ DSGN12 <chr> "Included", "Included", "Included", "Included", "Included", "Included"...
$ ELEVATION <dbl> 57.03, 57.03, 247.04, 141.76, 22.28, 50.83, 50.83, 96.12, 77.74, 181.4...
$ EPA_REG <chr> "Region_4", "Region_4", "Region_4", "Region_4", "Region_4", "Region_4"...
$ EVAL_NAME <chr> "Holt Lock and Dam", "Holt Lock and Dam", "George Lake", "Logan Martin...
$ EVALSTAT <chr> "TargetSampleable", "TargetSampleable", "TargetSampleable", "TargetSam...
$ FEOW_ID <int> 153, 153, 153, 153, 153, 153, 153, 154, 153, 153, 154, 152, 155, 153, ...
$ FS_EW <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ FW_ECO3 <chr> "EHIGH", "EHIGH", "EHIGH", "EHIGH", "PLNLOW", "PLNLOW", "PLNLOW", "PLN...
$ FW_ECO9 <chr> "SAP", "SAP", "SAP", "SAP", "CPL", "CPL", "CPL", "CPL", "CPL", "SAP", ...
$ GNIS_ID <int> NA, NA, 131985, NA, NA, 117983, 117983, NA, 154123, NA, NA, NA, 133275...
$ GNIS_NAME <chr> NA, NA, "George Lake", NA, NA, "Emmit Wood Lake", "Emmit Wood Lake", N...
$ HUC2 <int> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 6, 3, 3, 3, 3, 3, 3, 11, 11, 8, 8, 11...
$ HUC8 <int> 3160112, 3160112, 3160109, 3150106, 3160201, 3160203, 3160203, 3140202...
$ LAKE_ORIGIN <chr> "MAN_MADE", "MAN_MADE", "MAN_MADE", "MAN_MADE", "MAN_MADE", "MAN_MADE"...
$ LAKE_ORIGIN12 <chr> "RESERVOIR", "RESERVOIR", "RESERVOIR", "RESERVOIR", "RESERVOIR", "RESE...
$ LAT_DD83 <dbl> 33.29204, 33.29204, 34.22782, 33.76248, 32.51587, 31.62155, 31.62155, ...
$ LON_DD83 <dbl> -87.39138, -87.39138, -86.84345, -86.05316, -87.86108, -88.35374, -88....
$ MAJ_BASIN <chr> "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "3", "05_06", "3", "...
$ MDCATY <chr> "NonREF07", "NonREF07", "NonREF07", "NonREF07", "NonREF07", "(20,50]",...
$ MISS_BASIN <chr> "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "Yes...
$ NA_L1CODE <int> 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8...
$ NA_L2CODE <dbl> 8.4, 8.4, 8.4, 8.4, 8.3, 8.3, 8.3, 8.3, 8.3, 8.4, 8.3, 8.4, 8.3, 8.3, ...
$ NA_L3CODE <chr> "8.4.9", "8.4.9", "8.4.9", "8.4.1", "8.3.5", "8.3.5", "8.3.5", "8.3.5"...
$ NARS_NAME <chr> "Holt Lock and Dam", "Holt Lock and Dam", "George Lake", NA, NA, "Emmi...
$ NES_LAKE <chr> "NESLake", "NESLake", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ NESLAKE_ID <chr> "105", "105", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ NESSTORET <chr> "105", "105", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ...
$ OWN_TYPE <chr> "Non Federal", "Non Federal", "Non Federal", "Non Federal", "Non Feder...
$ OWNSHP <chr> "Non-Federal", "Non-Federal", "Non-Federal", "Non-Federal", "Non-Feder...
$ PANEL <chr> "NLA07RVT2", "NLA07RVT2", "NLA07RVT", "NLA07RVT", "NLA07RVT", "NLA12RV...
$ PERIM_KM <dbl> 104.2086662, 104.2086662, 11.8723053, 688.4591342, 698.1088203, 2.9989...
$ RCHCODE <chr> "3.16011E+12", "3.16011E+12", "3.16011E+12", "3.15011E+12", "3.16011E+...
$ RT_NLA12 <chr> "T", "T", "S", "S", "S", "S", "S", NA, NA, NA, NA, NA, NA, "R", "S", "...
$ SITEID_07 <chr> "NLA06608-2187", "NLA06608-2187", "NLA06608-0459", "NLA06608-2955", "N...
$ SITESAMP <chr> "Y", "Y", "Y", "Y", "Y", "Y", "Y", "N", "N", "N", "N", "N", "N", "Y", ...
$ SITETYPE <chr> "PROB", "PROB", "PROB", "PROB", "PROB", "PROB", "PROB", "PROB", "PROB"...
$ SIZE_CLASS <chr> ">50", ">50", ">50", ">50", ">50", "(20,50]", "(20,50]", "(1,4]", "(10...
$ ST_DSGN <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ ST_MDCATY <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ ST_PANEL <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ ST_SITEID <chr> "NLA12_AL-101", "NLA12_AL-101", "NLA12_AL-102", "NLA12_AL-103", "NLA12...
$ ST_STRATUM <chr> "AL_NLA07RVT", "AL_NLA07RVT", "AL_NLA07RVT", "AL_NLA07RVT", "AL_NLA07R...
$ ST_WGT_ALL <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ ST_WGT_CAT <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ ST_WGT_DSGN <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
$ STATE <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL"...
$ STATE_NLA <chr> "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL"...
$ STATE_PCNT <int> 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, ...
$ STATECTY <int> 1073, 1073, 1043, 1055, 1063, 1129, 1129, 1031, 1085, 1009, 1053, 1071...
$ STATUS <chr> "Target_Sampled", "Target_Sampled", "Target_Sampled", "Target_Sampled"...
$ STATUS_ST <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ STRATUM <chr> "AL_NLA07RVT", "AL_NLA07RVT", "AL_NLA07RVT", "AL_NLA07RVT", "AL_NLA07R...
$ TNT <chr> "Target", "Target", "Target", "Target", "Target", "Target", "Target", ...
$ TNT_ST <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ URBAN <chr> "Yes", "Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No", "No", "No",...
$ US_L3CODE <int> 68, 68, 68, 67, 65, 65, 65, 65, 65, 68, 65, 68, 65, 65, 65, 67, 45, 65...
$ US_L3CODE_2015 <int> 68, 68, 68, 67, 65, 65, 65, 65, 65, 68, 65, 68, 65, 65, 65, 67, 45, 65...
$ US_L4CODE <chr> "68f", "68f", "68d", "67g", "65b", "65f", "65f", "65d", "65a", "68d", ...
$ US_L4CODE_2015 <chr> "68f", "68f", "68d", "67g", "65p", "65f", "65f", "65d", "65a", "68d", ...
$ WGT_ALL <dbl> 20.288885, 20.288885, 46.617902, 51.331899, 101.491046, 83.376963, 83....
$ WGT_CAT <chr> "AL_>4", "AL_>4", "AL_>4", "AL_>4", "AL_>4", "AL_>4", "AL_>4", "AL_(1,...
$ WGT_DSGN <dbl> 3.666670, 3.666670, 3.666667, 3.666667, 3.666667, 21.875000, 21.875000...
$ XCOORD <dbl> 795432.2, 795432.2, 835851.3, 913172.8, 759549.6, 721685.4, 721685.4, ...
$ YCOORD <dbl> -434840.8, -434840.8, -326047.1, -370517.5, -524957.0, -628108.3, -628...
$ ECO_BIO <chr> "EHIGH", "EHIGH", "EHIGH", "EHIGH", "CPL", "CPL", "CPL", "CPL", "CPL",...
$ IM_COMMENT <chr> "No phab done on this visit; visit 2 assigned as index visit", "Assign...
$ INDEX_NLA <chr> "N", "Y", "Y", "Y", "Y", "N", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", ...
$ INDXSAMP_BENT <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ INDXSAMP_CHEM <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ INDXSAMP_CHLA <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ INDXSAMP_CORE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ INDXSAMP_ENTE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ INDXSAMP_MICR <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ INDXSAMP_PHAB <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ INDXSAMP_SDHG <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ INDXSAMP_ZOOP <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ REF_NLA12_NUTR <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "R", NA, NA, "R", ...
$ RT_NLA12_BENT <chr> "T", "T", "S", "S", "S", "S", "S", NA, NA, NA, NA, NA, NA, "R", "S", "...
$ RT_NLA12_ZOOP <chr> "T", "T", "S", "S", "S", "S", "S", NA, NA, NA, NA, NA, NA, "R", "S", "...
$ SAMPLED_BENT <chr> "Y", "Y", "Y", "N", "Y", "Y", "Y", NA, NA, NA, NA, NA, NA, "Y", "Y", "...
$ SAMPLED_CORE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ SAMPLED_ENTE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ SAMPLED_MICR <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ SAMPLED_PHAB <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ SAMPLED_PHYT <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ SAMPLED_PROFILE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ SAMPLED_SDHG <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ SAMPLED_ZOCN <chr> "Y", "Y", "Y", "Y", "Y", "Y", "Y", NA, NA, NA, NA, NA, NA, "Y", "Y", "...
$ SAMPLED_ZOCR <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ SAMPLED_ZOFN <chr> "Y", "Y", "Y", "Y", "Y", "Y", "Y", NA, NA, NA, NA, NA, NA, "Y", "Y", "...
$ SAMPLED_ZOFR <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ SAMPLED_ZOOP <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ PUBLICATION_DATE <chr> "8/23/2016", "8/23/2016", "8/23/2016", "8/23/2016", "8/23/2016", "8/23...
Unfortunately, the RCHCODE values in the 2012 data that can be downloaded online were exported improperly, using scientific notation that cuts off the very important final digits. For example:
With luck, we can work around this issue.
No need.
No need.
nla2007 <- nla2007_orig %>%
distinct(SITE_ID, REACHCODE, COM_ID)
nla2012 <- nla2012_orig %>%
distinct(SITE_ID, SITEID_07, COMID2012, COMID2007, COMIDS2007, GNIS_ID, GNIS_NAME, NESSTORET)
We will use the nla2012 table as the left table in the join and also join through the original nhd_plus dataset so that we can produce a list of NLA lakes that could not be linked to a lagoslakeid. The NLA lakes that don’t link will be assigned a lagoslakeid through our manual linking process (that starts with a spatial link).
Check on how many lakes didn’t link through each step.
37 lakes out of 2664 NLA lakes cannot be linked to the NHDPlusV2. We did try the V1-to-V2 crosswalk and were not able to make any additional links. Far more lakes are unable to be linked to a lagoslakeid: 711. This is due to the poor match between the NHDPlusV2 and the LAGOS Lake population. If we improve that link, these numbers will increase. This work is proposed.
Select 2 columns from the NLA 2012 and lagoslakeid.
nla2012_lagos <- nla2012_lagos_join %>%
select(
NLA2012_SITE_ID = SITE_ID,
NLA2007_SITE_ID = SITEID_07,
lagoslakeid
)
For the final table, we’ll join all the previous results together with the entire LAGOS-US lake population on the left side of the join. Then, in order to facilitate a user understanding of 1-to-many relationships in the table, we will generate some count columns that indicate when multiple identifiers in one dataset relate to a single identifier in another. Finally, we will do the final selection of columns for the output and rename a few more columns to match the naming template.
Glimpse at the final table.
Observations: 569,329
Variables: 26
$ lagoslakeid <int> 31198, 31198, 31198, 31198, 31198, 31198, 122, 122, 12...
$ NHDHR_PermanentIdentifier <chr> "120020498", "120020498", "120020498", "120020498", "1...
$ NHDHR_ReachCode <chr> "04010201003330", "04010201003330", "04010201003330", ...
$ NHDHR_AreaSqKm <dbl> 2.745000, 2.745000, 2.745000, 2.745000, 2.745000, 2.74...
$ GNIS_ID <int> 649289, 649289, 649289, 649289, 649289, 649289, 653169...
$ GNIS_Name <chr> "Perch Lake", "Perch Lake", "Perch Lake", "Perch Lake"...
$ LAGOS_LakeName <chr> "Perch Lake", "Perch Lake", "Perch Lake", "Perch Lake"...
$ LAGOS_CountyName <chr> "Carlton County", "Carlton County", "Carlton County", ...
$ LAGOS_CountyFIPS <chr> "27017", "27017", "27017", "27017", "27017", "27017", ...
$ LAGOS_LatitudeNAD83 <dbl> 46.68943, 46.68943, 46.68943, 46.68943, 46.68943, 46.6...
$ LAGOS_LongitudeNAD83 <dbl> -92.67066, -92.67066, -92.67066, -92.67066, -92.67066,...
$ State <chr> "MN", "MN", "MN", "MN", "MN", "MN", "MN", "MN", "MN", ...
$ WQP_MonitoringLocationIdentifier <chr> "FONDULAC-114A", "FONDULAC-114B", "FONDULAC_WQX-114A",...
$ WQP_MonitoringLocationName <chr> "Perch Lake (North Basin)", "Perch Lake (South Basin)"...
$ WQP_ProviderName <chr> "STORET", "STORET", "STORET", "STORET", "STORET", "STO...
$ NHDPlusv2_COMID <int> 1776072, 1776072, 1776072, 1776072, 1776072, 1776072, ...
$ NHDPlusv2_ReachCode <chr> "04010201003330", "04010201003330", "04010201003330", ...
$ NHDPlusv2_AreaSqKm <dbl> 2.720, 2.720, 2.720, 2.720, 2.720, 2.720, 1.348, 1.348...
$ LAGOSNE_lagoslakeid <int> 31198, 31198, 31198, 31198, 31198, 31198, 122, 122, 12...
$ NLA2007_SITE_ID <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ NLA2012_SITE_ID <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
$ count_wqp_per_lagos_id <int> 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 0, 0, 0, 0, 4, 4, ...
$ count_nhdplusv2_per_lagos_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 0, 1, 2, 2, ...
$ count_lagos_per_nhdplusv2_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, NA, 1, 1, 1,...
$ count_lagosNE_per_lagos_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
$ count_lagos_per_lagosNE_id <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
An example of a group of lakes in the final table.
Write out the final product.
These items can be described as designed limitations because this lake identifier crosswalk is designed primarily to work with LAGOS. Or–if there is enough interest from the preview group–we can consider adding these capabilities. G) A user cannot find identifiers pertaining to a lake that isn’t in LAGOS-US. SOLUTION: Use outer joins and allow the crosswalk to grow a fair amount in size to allow NULLs in any of the identifiers. Revise the documentation to reflect the expansion. H) A user cannot connect WQP sites to the NHDPlusV2 representation of lakes. Example: 2 NHDPlusV2 lakes to 1 LAGOS-US lake. SOLUTION: Refine the entity-relationship model to join between the WQP and NHDPlusV2 directly. Rework subsequent joins in the script to accomodate the change. This change probably won’t affect the overall size or usability of the crosswalk too much.
You can find a demonstration of LAGOS Lake Link here