NC0001422_Appdx H (1 of 2) - Compliance Values_20200814

Corrective Action Plan Update August 2020
L.V. Sutton Energy Complex
SynTerra

APPENDIX H
COMPLIANCE VALUE CALCULATION APPROACH

Arcadis U.S., Inc.
630 Plaza Drive, Suite 200
Highlands Ranch, Colorado 80129
Tel 720 344 3500
Fax 720 344 3535

To: Scott Davies, Duke Energy
Copies:
From: Julie Sueker, PhD, PH, PE
Date: June 22, 2020
Arcadis Project No.: 30043729
Subject: COI Compliance Value Calculation Memo - L.V. Sutton Energy Complex

This technical memorandum, titled COI Compliance Value Calculation Memo - L.V. Sutton Energy Complex, was prepared by Arcadis U.S., Inc. on behalf of Duke Energy Progress, LLC (Duke Energy) and presents the statistical approach used for calculating compliance values. A compliance value is defined as either the 95 percent (%) lower confidence limit (LCL95) or the central tendency value (CTV) of constituent of interest (COI) concentrations in groundwater at Duke Energy coal ash basin sites in North Carolina. The North Carolina Department of Environmental Quality (NCDEQ) has requested¹ that the compliance value be calculated in the following manner:

• LCL95 methodology will be used as described in the October 24, 2019 letter to Paul Draovitch, Duke Energy, signed by Jim Gregson, NCDEQ. Per guidance provided in this letter, constituent concentration data were evaluated for temporal trends prior to calculating LCL95 values.
• CTV values will be used in place of LCL95 values if any of the following conditions are met:
   o Fewer than four valid data points are available (n < 4),
   o Greater than or equal to 50% (≥ 50%) of constituent concentrations are non-detect (constituent concentration below the laboratory reporting limit), or
   o A non-parametric distribution is selected but there is insufficient data coverage (typically when n = 4) to calculate an LCL95 with sufficient statistical confidence.
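The selection rule in the bullets above can be sketched in a few lines. The Python below is illustrative only: the memo's calculations were performed in R, and the function names (`min_order_stat_confidence`, `select_compliance_method`) are hypothetical. The sketch also assumes that "insufficient data coverage" refers to the confidence attainable from the smallest order statistic under a cumulative binomial model with p = 0.5. Under that assumption, four samples give at most 1 - 0.5^4 = 93.75% one-sided confidence on the median, which is short of 95% and is consistent with the "typically when n = 4" remark.

```python
def min_order_stat_confidence(n: int) -> float:
    """Confidence that the smallest of n independent samples lies below the
    population median: 1 - 0.5**n (cumulative binomial with p = 0.5)."""
    return 1.0 - 0.5 ** n


def select_compliance_method(n_valid: int, n_nondetect: int, nonparametric: bool) -> str:
    """Return 'CTV' if any of the three listed conditions holds, else 'LCL95'.

    The coverage check for non-parametric data is an assumed interpretation,
    not a rule quoted from the memo.
    """
    if n_valid < 4:
        return "CTV"  # fewer than four valid data points
    if n_nondetect / n_valid >= 0.5:
        return "CTV"  # 50% or more non-detects
    if nonparametric and min_order_stat_confidence(n_valid) < 0.95:
        return "CTV"  # not enough samples for 95% non-parametric coverage
    return "LCL95"


print(min_order_stat_confidence(4))            # 0.9375
print(select_compliance_method(4, 0, True))    # CTV
print(select_compliance_method(5, 0, True))    # LCL95
```

Under this reading, five samples are the minimum for a non-parametric LCL95, because 1 - 0.5^5 = 96.875%.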
¹ May 5, 2020 email from Rick Bolich, NCDEQ, to Scott Davies, Duke Energy, titled LCL95 Use.

The constituents evaluated consist of metals and other inorganic constituents. The evaluation approach presented herein was performed in R (programming language) with the EnviSci package developed by Arcadis U.S., Inc. (Arcadis). Other software options for calculating these values, particularly LCL95s, are limited. The USEPA-distributed ProUCL software does not directly calculate LCL95 values, nor do other standard statistical software programs. While these compliance values could be calculated manually, the volume of data in this analysis precludes a manual approach.

The following steps outline the approach followed in calculating compliance values for constituent concentrations in groundwater. First, the groundwater constituent concentration data were pre-processed to facilitate the calculations.

1) For calculating LCL95 values, the data were filtered to include only samples collected on or after January 1, 2015 (the start of the CAMA monitoring program; Attachment A-1). For calculating CTVs, the data were filtered to include only samples collected on or after January 1, 2019 (recent samples; Attachment A-2).
2) The data were filtered to include only wells of interest if a subset of wells was being evaluated; otherwise all site wells were included in the evaluation. For Sutton, all wells were considered in the evaluation.
3) Results with "RO" or ">" laboratory qualifier flags were removed, as these data are not usable.
4) Results with "<", "U", or "ND" laboratory qualifier flags were marked as non-detects.
5) All other samples were considered detected concentrations (detects), including values with a "J" laboratory qualifier flag (constituent detected at a concentration below the laboratory reporting limit).
6) Non-detect values with reporting limits greater than the largest detected concentration for that well/analyte pair were considered elevated non-detect values and were excluded from the data.²
7) Records with turbidity ≥ 10 nephelometric turbidity units (NTU) were removed.
8) For records with pH ≥ 10 standard units (S.U.), all analytical results were removed except for boron. Wells with pH ≥ 10 S.U. have likely been adversely affected by well grout.
9) Duplicate sample results for a given monitoring well and date were removed. The parent sample was retained.³
10) Negative values for radium or uranium were set to a positive value less than the smallest positive reported value.

After the data were pre-processed, the LCL95s and CTVs were calculated for each COI at each well, as described in detail below. If there were sufficient data to calculate an LCL95 concentration, this LCL95 concentration was used as the compliance value, and the CTV was discarded. However, if an LCL95 could not be calculated due to insufficient sample size (n < 4), too many non-detects (≥ 50%), or insufficient non-parametric coverage, the CTV was used as the compliance value instead. In some circumstances, such as wells with only one to three pre-2019 sampling events, neither could be calculated, and so no compliance value is reported. The LCL95 or CTV for each well/analyte pair, whichever is applicable, is presented in Attachment A-3. The following sections present methods followed for calculating LCL95 and CTV constituent concentrations.

² Removing elevated non-detects primarily removes values with an unusually or unacceptably high reporting limit due to imprecise laboratory methods. However, it can also remove "acceptable" non-detects when all detected values are J-flagged (constituent detected at a concentration between the laboratory detection limit and the reporting limit). Due to the different date ranges used for LCL95 and CTV calculations, in some cases, some non-detects would have been considered elevated for LCL95 calculations but not CTV calculations or vice versa.

³ Parent and duplicate samples were determined based on a ranked order of sampling events. The selected order was CAMA, IMP, CCR, NPDES, and then any other miscellaneous sampling events. The highest ranked sampling event for a given well and date was designated the parent sample, and any other sampling events were designated duplicates.

LCL95 CALCULATION

LCL95s were calculated following EPA (2009) Unified Guidance. This process is complex, and enumerating every step is beyond the scope of this document. For specific details on how certain statistical procedures are calculated, the reader is referred to the EPA Unified Guidance (2009). However, the outline of the process is provided here. Unless otherwise stated, all statistical tests used an alpha (α) of 0.05. LCL95s were calculated following four steps:

1. Any non-detects present in the data were evaluated to determine how they should be treated.
2. Data were evaluated for temporal trends.
3. Data were evaluated for an underlying probability distribution.
4. Based on the outcome of the first three steps, the appropriate LCL95 calculation was performed.

Non-Detects and Sample Size

Data were first checked for validity for calculating LCL95 concentrations based on the number and type of both detect and non-detect values, following EPA Unified Guidance (2009). Elevated non-detects were excluded as described above. After elevated non-detects were excluded, if a data set for a well/analyte pair contained ≥ 50% non-detects, fewer than four total observations (n < 4), or fewer than two unique values, that well/analyte combination was excluded from further LCL95 analysis and the CTV was used in place of the LCL95. When the data set for a well/analyte pair contained five (5) or fewer total observations, non-detects were substituted at half the reporting limit, as recommended by EPA (2009).
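The screening described above (exclusion of elevated non-detects, the eligibility checks, and half-reporting-limit substitution for small data sets) can be sketched as follows. This is a minimal Python illustration and not the EnviSci implementation; the `Result` record and all function names are hypothetical. Two behaviors are assumptions that the memo does not specify: when a data set has no detects at all, no non-detects are treated as elevated, and "unique values" is counted over the stored value (the concentration for detects, the reporting limit for non-detects).

```python
from dataclasses import dataclass


@dataclass
class Result:
    value: float    # detected concentration, or the reporting limit if non-detect
    detected: bool


def drop_elevated_nondetects(results):
    """Exclude non-detects whose reporting limit exceeds the largest detect."""
    detects = [r.value for r in results if r.detected]
    if not detects:
        return list(results)  # assumption: nothing to compare against
    max_detect = max(detects)
    return [r for r in results if r.detected or r.value <= max_detect]


def lcl95_eligible(results) -> bool:
    """Apply the n < 4, >= 50% non-detect, and < 2 unique value checks."""
    n = len(results)
    if n < 4:
        return False
    n_nondetect = sum(1 for r in results if not r.detected)
    if n_nondetect / n >= 0.5:
        return False
    return len({r.value for r in results}) >= 2


def substitute_half_rl(results):
    """Replace each non-detect with half its reporting limit (for n <= 5)."""
    return [r.value if r.detected else r.value / 2.0 for r in results]


raw = [Result(1.2, True), Result(0.8, True), Result(5.0, False),
       Result(0.5, False), Result(1.0, True)]
kept = drop_elevated_nondetects(raw)   # the non-detect with RL 5.0 is dropped
print(lcl95_eligible(kept))            # True
print(substitute_half_rl(kept))        # [1.2, 0.8, 0.25, 1.0]
```

In this example the surviving data set has four observations and one non-detect (25%), so it remains eligible for an LCL95.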
For data sets with more than five observations, a substitution method was used for replacing non-detect values. For parametric estimation methods on data sets without a trend, EPA (2009) recommends either Kaplan-Meier estimation or rROS imputation for substitution of non-detects. rROS imputation is a more accurate method, particularly when there is only one reporting limit for non-detects. Therefore, rROS imputation was used for substitution of non-detect values. Prior to implementing rROS imputation, data were transformed according to the hypothesized data distribution. Then, non-detect values were imputed with a lower imputation limit of zero. Finally, the data set was checked with goodness-of-fit tests, as described below. In other words, rROS imputations were always estimated based on the same data distribution used to estimate LCL95s.

For trend analysis and for non-parametric estimations of LCL95 values that had no trend, non-detects were evaluated at the reporting limit (non-detects were limited to ≤ 5% for parametric trend analysis).

Trend Evaluation

All data that met the minimum requirements for sample size and percentage of non-detects were evaluated for monotonic temporal trends. The trend-based LCL95 calculation has somewhat more stringent data requirements than non-trend LCL95 estimation, so additional data checks were performed. With fewer than eight data points, trend analysis is not considered reliable (EPA 2009). Therefore, for data sets with fewer than eight data points, it was assumed that there is no trend.

If 5% or fewer of the data points were non-detects, the data were evaluated for a trend using linear regression. A best-fit line was calculated for the data, and then linear regression model assumptions were checked. Normality of the residuals was assessed as described below, and homoskedasticity was evaluated with a non-constant error variance test, otherwise known as a Breusch-Pagan test.
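The gating logic and the homoskedasticity check described above can be illustrated with a short Python sketch. It is not the authors' R code, and all function names are hypothetical. The variance test shown is Koenker's studentized form of the Breusch-Pagan test (LM = n times R-squared from regressing the squared residuals on the predictor, compared against a chi-square distribution with one degree of freedom); the memo does not say which variant was used, so this choice is an assumption.

```python
import math


def trend_method(n: int, nondetect_fraction: float) -> str:
    """Pick the trend-evaluation route from sample size and non-detect share."""
    if n < 8:
        return "assume no trend"
    if nondetect_fraction <= 0.05:
        return "linear regression"
    return "non-parametric trend test"


def ols_fit(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y):
    """Coefficient of determination of the least-squares line of y on x."""
    slope, intercept = ols_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_res / ss_tot


def breusch_pagan_pvalue(x, y):
    """Studentized Breusch-Pagan p-value for a single predictor (assumed variant)."""
    slope, intercept = ols_fit(x, y)
    squared_residuals = [(yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)]
    lm = max(len(x) * r_squared(x, squared_residuals), 0.0)
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(lm / 2.0))


x = list(range(1, 11))
# Made-up series whose scatter grows with x, so the variance is not constant.
y_spread = [xi + (-1) ** xi * 0.2 * xi for xi in x]
print(trend_method(len(x), 0.0))          # linear regression
print(breusch_pagan_pvalue(x, y_spread))  # small p-value: constant variance rejected
```

A p-value below the chosen alpha indicates that the constant-variance assumption fails for the untransformed data.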
If the model failed these assumptions, the data were log transformed or Box-Cox transformed, as described below, and model assumptions were rechecked. If model assumptions were met for any of the above data distributions, the data were evaluated for a significant trend with standard linear regression. If no significant trend was present, it was assumed that the constituent concentrations over time are stable, and non-trend analyses were performed for calculating the LCL95 instead. If a significant trend was present, the LCL95 was calculated based on a normal distribution around the trendline prediction at the most recent data point.

If linear models did not meet model assumptions with any of the above data transformations, or if the data contained more than 5% non-detects, trends were evaluated with the non-parametric Mann-Kendall trend test. The Mann-Kendall trend test is only viable when non-detects occur in the lower part of the data distribution, so if any non-detects had reporting limits greater than the median detected value, the data were assumed to have no trend and non-trend analyses were performed for calculating the LCL95 instead. Similarly, if the Mann-Kendall test was valid but no significant trend was present, it was assumed that the constituent concentrations over time are stable, and non-trend analyses were performed for calculating the LCL95 instead. If a significant trend was present based on the Mann-Kendall trend test, the LCL95 was calculated by selecting the lower 95th percentile of bootstrapped Theil-Sen predictions for the most recent data point, based on 500 bootstrap iterations.

Assessing Probability Distributions

All data for parametric tests (including regression residuals, data without non-detects, and data with rROS-imputed non-detects) were checked for goodness of fit against a normal distribution.
Where there were fewer than 50 observations, the Shapiro-Wilk test was used, and if there were 50 or more observations, the Shapiro-Francia test was used. Based on EPA (2009) recommendation, the alpha for these tests was set on a sliding scale based on sample size. If there were fewer than 10 observations, the alpha was 0.1; if there were between 10 and 19 observations, the alpha was 0.05; and if there were 20 or more observations, the alpha was 0.01. Additionally, the skewness of the data was evaluated to determine whether its absolute value was less than 1.

If the data met these normality assumptions (i.e., passed a goodness-of-fit test with an absolute skewness less than one), they were assumed to follow a normal distribution, and LCL95 calculations proceeded as such. However, if the data failed to meet these assumptions, the data were then transformed with a natural log transformation (non-detects were re-imputed using rROS, if present) and the tests above were repeated. If normality assumptions were met on the natural log-transformed data, the data were assumed to follow a lognormal distribution. If normality assumptions were not met on the natural log-transformed data, the data were then transformed with a Box-Cox power transformation with an optimized lambda between -2 and 2, and non-detects were re-imputed using rROS, if present. If the Box-Cox transformed data met normality assumptions, the data were assumed to have a power distribution. Otherwise, the data were assumed to be non-parametric.

Wherever data transformations were used, LCL95 calculations were performed in the transformed units and then back-transformed into the original units. Thus, LCL95s calculated based on transformed data are confidence limits about the geometric mean or median of the data, rather than the arithmetic mean. Similarly, non-parametric LCL95s are confidence limits about the median.
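The sample-size rules above translate directly into code, and a small numeric example shows why a back-transformed log-scale mean is a geometric mean. The Python below is a sketch with hypothetical function names, and it does not implement the Shapiro-Wilk or Shapiro-Francia tests themselves. The skewness estimator is the simple moment coefficient, which is an assumption; the memo does not state which estimator was used.

```python
import math
import statistics


def normality_test_for(n: int) -> str:
    """Test selection by sample size, as described in the memo."""
    return "Shapiro-Wilk" if n < 50 else "Shapiro-Francia"


def normality_alpha(n: int) -> float:
    """Sliding alpha scale: 0.1 for n < 10, 0.05 for 10 to 19, 0.01 for n >= 20."""
    if n < 10:
        return 0.10
    if n < 20:
        return 0.05
    return 0.01


def sample_skewness(data) -> float:
    """Moment coefficient of skewness (assumed estimator)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((v - mean) ** 2 for v in data) / n
    m3 = sum((v - mean) ** 3 for v in data) / n
    return m3 / m2 ** 1.5


def passes_skewness_check(data) -> bool:
    return abs(sample_skewness(data)) < 1.0


# One large value makes the made-up data strongly right-skewed (about 1.5).
print(passes_skewness_check([1, 1, 1, 1, 10]))   # False

# Averaging on the log scale and back-transforming gives the geometric mean.
values = [1.0, 10.0, 100.0]
back_transformed = math.exp(statistics.fmean(math.log(v) for v in values))
print(math.isclose(back_transformed, statistics.geometric_mean(values)))  # True
```

The last two lines illustrate the memo's point that limits computed on log-transformed data describe the geometric mean, here about 10, and not the arithmetic mean of 37.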
Calculating LCL95s

If data were found to have a temporal trend, LCL95s were estimated around the most recent time point in the data based on either the Student's t-distribution (for parametric data) or bootstrapping (for non-parametric data), as described above. Similarly, if the data were found not to have a temporal trend or were disqualified from trend analysis, parametric LCL95s were calculated based on the Student's t-distribution for normal or transformed-normal data. Non-parametric analyses of non-trend data were calculated with a cumulative binomial distribution. When the data were found to be non-parametric, but the sample size was insufficient for 95% coverage (typically when n = 4), no LCL95 was calculated.

Negative Values

Due to the inherent mathematics of confidence interval calculations, LCL95s are occasionally estimated to be negative, particularly when 1) decreasing trends are present, 2) measured concentrations are relatively low, and/or 3) variability in the data is relatively high. This can happen even when no non-detects are present in the data. Although it is not actually possible for a sample to have a negative concentration of a constituent, this negative estimation is not an error; it simply means that the central tendency of the data (mean, geometric mean, or median) is not significantly different from zero. Thus, these negative LCLs were reported as calculated. For the purpose of mapping, negative LCL values will be posted as less than the reporting limit (<RL).

CTV CALCULATION

CTVs are calculated to capture well/analyte pairs with insufficient sample size (n < 4) and/or too many non-detects (≥ 50%) to calculate an LCL95. The EPA (2009) Unified Guidance provides recommendations for testing statistical distributions and substitution of non-detect values.
These EPA (2009) statistical distribution testing and non-detect value substitution methods do not readily apply to calculation of CTVs, as they are intended primarily for large, robust data sets. CTVs are calculated on small data sets (n < 4) or data sets with ≥ 50% non-detects. Based on these CTV data set limitations, non-detect values were substituted at half their respective reporting limits, as is recommended by Unified Guidance (EPA 2009) for small sample sizes.

CTV calculations do not account for historic trends in the constituent concentrations. Therefore, recent data (samples collected on or after January 1, 2019) were used to best represent current constituent concentrations in groundwater. If the range of detected constituent concentrations was greater than one order of magnitude, the geometric mean was calculated and used as the CTV concentration. If the range of detected constituent concentrations was less than one order of magnitude, the arithmetic mean was calculated and used as the CTV concentration.

SUMMARY

In summary, selected COI concentrations are provided in Attachment A-3.

ATTACHMENTS

Attachment A-1 LCL95 Dataset with Samples Collected on or after January 1, 2015 to Present
Attachment A-2 CTV Dataset with Samples Collected on or after January 1, 2019 to Present
Attachment A-3 Selected COI Concentration for Well/Analyte Pairs

REFERENCES

EPA. 2009. Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities: Unified Guidance. March.

ATTACHMENT A (PROVIDED ELECTRONICALLY)

A-1 LCL95 Dataset with Samples Collected on or after January 1, 2015 to Present
A-2 CTV Dataset with Samples Collected on or after January 1, 2019 to Present
A-3 Selected COI Concentration for Well/Analyte Pairs
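Illustrative sketch for the CTV CALCULATION section: the rule described there (half-reporting-limit substitution, then a geometric or arithmetic mean depending on the spread of the detects) can be written compactly in Python. The function name `central_tendency_value` and the tuple layout are hypothetical, and this is not the authors' implementation. Two cases are not specified in the memo and are handled by assumption here: a detect range of exactly one order of magnitude, and a data set with no detects, both of which fall back to the arithmetic mean. The sketch also assumes all values are positive, as the geometric mean requires.

```python
import statistics


def central_tendency_value(results) -> float:
    """results: list of (value, detected) pairs, where value is the detected
    concentration or, for a non-detect, its reporting limit."""
    detects = [value for value, detected in results if detected]
    # Non-detects enter the calculation at half their reporting limit.
    substituted = [value if detected else value / 2.0 for value, detected in results]
    # "Greater than one order of magnitude" is read as max/min > 10 (assumption).
    if detects and max(detects) / min(detects) > 10.0:
        return statistics.geometric_mean(substituted)
    return statistics.fmean(substituted)


# Detects span a factor of 20, so the geometric mean is used.
wide = [(1.0, True), (20.0, True), (2.0, False)]
# Detects span a factor of 2, so the arithmetic mean is used.
narrow = [(2.0, True), (4.0, True), (2.0, False)]
print(central_tendency_value(wide))
print(central_tendency_value(narrow))
```

For the `wide` example the substituted values are 1, 20, and 1, so the result is the cube root of 20; for `narrow` they are 2, 4, and 1, giving 7/3.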