Filling Gaps in the Measurement Records

Because the ARM program records actual measurements rather than estimated values, gaps unavoidably arise in the temporal stream of measurements when instruments or supporting infrastructure fail. Most gaps are short in duration and affect only one or a few related parameters. However, some failures, such as wide-area power outages or ice storms, occasionally affect nearly all recorded parameters at one or more ARM facilities simultaneously. One phase of this project seeks to statistically characterize the nature of the data gaps in various ARM parameters.

Even more rarely, the instruments and sensors may themselves become unstable or defective. Each ARM data file contains in its header some Quality Control information, usually provided by the instrument mentor, which establishes minimum and maximum limits for reasonable measurements. We apply these criteria, filtering out measurements that fall outside the QC limits, when preparing the customized statistical summaries we generate for carbon modeling. Sensors may produce spurious measurement values outside the QC limits immediately prior to complete failure. Such Quality Control "trimming" is important for usability, but it creates new data gaps and makes existing gaps even larger.
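As a concrete illustration, the sketch below (in Python rather than our actual processing environment) masks values that fall outside a mentor-supplied valid range. The parameter names and the use of NaN as the missing-value marker are assumptions for illustration, not the ARM file conventions themselves.

```python
# Minimal sketch of QC "trimming": mask measurements outside a mentor-supplied
# valid range before computing summaries.  The valid_min/valid_max names and the
# use of NaN as the missing-value marker are illustrative assumptions.
import numpy as np
import pandas as pd

def qc_trim(values: pd.Series, valid_min: float, valid_max: float) -> pd.Series:
    """Replace out-of-range values with NaN, creating (or widening) gaps."""
    trimmed = values.copy()
    trimmed[(trimmed < valid_min) | (trimmed > valid_max)] = np.nan
    return trimmed

# Example: hourly air temperature (deg C) with a spurious spike before a sensor failure.
temps = pd.Series([21.3, 22.1, 87.5, np.nan, np.nan, 20.8])
print(qc_trim(temps, valid_min=-40.0, valid_max=55.0))
```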

Potential gap filling methods vary widely in terms of sophistication and complexity. Effort invested in filling gaps must be justified by evaluating whether carbon models are sensitive to or benefit from marginal improvements provided by more involved gap imputation methods.

Do various gap-filling methods make a difference?

The question above can be recognized as a variant of the one which we designed our "Make-a-Difference" (MAD) framework to answer. Essentially a customized analysis to test carbon model sensitivity, the MAD framework can also be used to evaluate and justify the necessity of using increasingly complex gap-filling procedures. When a MAD framework evaluation no longer demonstrates a difference, there is no need to invest additional intelligence or energy in imputation methods.

Of course, this sensitivity will vary with the selection of particular parameters, carbon models, output estimates, and imputation methods. Nevertheless, the existing MAD framework provides a "yardstick" which can be used for the evaluation of each specific combination of parameter, model, model prediction, and gap-filling method.
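For illustration only, the sketch below shows the general shape of such a sensitivity check, not the MAD framework itself: the same placeholder carbon model is run on inputs filled two different ways, and the change in its output is compared against a tolerance. The carbon_model function and the tolerance value are hypothetical stand-ins.

```python
# Illustrative sketch (not the MAD framework itself) of a yardstick-style check:
# does a more complex gap fill change the model output by more than a tolerance?
import numpy as np

def carbon_model(driver: np.ndarray) -> float:
    """Stand-in for a carbon model output estimate (placeholder calculation)."""
    return float(np.nanmean(driver))

def makes_a_difference(simple_fill: np.ndarray,
                       complex_fill: np.ndarray,
                       tolerance: float) -> bool:
    """True if the more complex gap filling changes the model output materially."""
    return abs(carbon_model(simple_fill) - carbon_model(complex_fill)) > tolerance

rng = np.random.default_rng(0)
driver = rng.normal(20.0, 5.0, 8760)                          # one year of hourly values
simple = driver.copy();  simple[100:124] = driver[99]         # persistence fill
complex_ = driver.copy(); complex_[100:124] = driver[76:100]  # previous-day fill
print(makes_a_difference(simple, complex_, tolerance=0.05))
```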

Imputation Methods

Imputation of missing values can be based on neighboring values in the dimension of time, the dimension of space, or a combination of both. Values present at the same site but at other times can be used in making the estimate, or values present at other sites at the same time can be used. We also distinguish univariate methods, which use information only from the same parameter that is missing and being estimated, from multivariate methods, which use information from parameters other than (or in addition to) the missing parameter. Because of their simplicity, our initial imputation approaches are univariate. We will build in complexity toward multivariate approaches as necessary and as justified by the MAD framework tests.
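The sketch below illustrates the two univariate substitution families in their simplest forms, time substitution from the previous day and space substitution from a nearby site. The synthetic series, site names, and 24-hour lag are illustrative assumptions only.

```python
# Minimal sketch of univariate time substitution vs. space substitution for an
# hourly series; the synthetic data and the 24-hour lag are illustrative.
import numpy as np
import pandas as pd

hours = pd.date_range("2002-06-01", periods=72, freq="H")
site_a = pd.Series(20 + 5 * np.sin(np.arange(72) * 2 * np.pi / 24), index=hours)
site_b = site_a + 0.5                      # a nearby, well-correlated site
site_a.iloc[30:36] = np.nan                # a six-hour gap at site A

time_fill  = site_a.fillna(site_a.shift(24))   # time substitution: previous day
space_fill = site_a.fillna(site_b)             # space substitution: neighboring site
```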

This project is not the first to be faced with the problem of filling holes in a stream of measurements. The Ameriflux network of eddy-flux towers records measurements that are used in modeling and therefore cannot contain gaps. Eddy-correlation flux estimates are themselves based on a statistical model which, in turn, relies on a set of ancillary measurements. Missing data in either the primary or the ancillary measurements prevent flux estimates from being made or used. The Ameriflux sites have developed an elaborate approach to gap-filling to address this problem. Their methods, however, are too specialized to be of general use here.

A Generic Approach to Imputation

We are developing a generalized Univariate Generic Imputer Tool (UGIT) which is capable of automatically filling gaps in measurements by selecting the best method from a suite of standard statistical models based on time substitution, space substitution, or a combination of both. This tool, written in SAS, uses temporally lagged measurements from the previous day, week, and year, as well as the best available alternative site and the average across all available sites, to build a regression model for estimating missing values. A combination of the best available alternative site and the previous day is also considered. Whichever model produces the lowest RMSE among this suite of regression approaches is actually used to estimate the missing values.
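The actual tool is written in SAS; the Python sketch below only illustrates the selection logic described above, fitting one ordinary least squares regression per candidate predictor set and keeping the model with the lowest RMSE. The function and column names are illustrative, not part of UGIT itself.

```python
# Sketch of the UGIT-style model suite: build lagged and spatial predictors,
# fit a simple regression for each candidate, and pick the lowest-RMSE model.
# Lags assume an hourly series long enough to cover the previous year.
import numpy as np
import pandas as pd

def fit_rmse(y: pd.Series, X: pd.DataFrame) -> float:
    """Ordinary least squares on the overlapping non-missing records; return RMSE."""
    data = pd.concat([y, X], axis=1).dropna()
    A = np.column_stack([np.ones(len(data)), data.iloc[:, 1:].to_numpy()])
    coef, *_ = np.linalg.lstsq(A, data.iloc[:, 0].to_numpy(), rcond=None)
    resid = data.iloc[:, 0].to_numpy() - A @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

def best_model(target: pd.Series, best_site: pd.Series, all_site_mean: pd.Series):
    candidates = {
        "prev_day":           target.shift(24).to_frame("lag24"),
        "prev_week":          target.shift(24 * 7).to_frame("lag168"),
        "prev_year":          target.shift(24 * 365).to_frame("lag8760"),
        "best_site":          best_site.to_frame("best"),
        "all_sites_avg":      all_site_mean.to_frame("mean"),
        "best_site_prev_day": pd.concat([best_site.rename("best"),
                                         target.shift(24).rename("lag24")], axis=1),
    }
    scores = {name: fit_rmse(target, X) for name, X in candidates.items()}
    return min(scores, key=scores.get), scores
```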

The UGIT tool first assembles all available daily or hourly summarized data from all ARM facilities and Oklahoma MESONET sites. Day, week, and year time-lagged values are calculated for each parameter at each site. The single best alternative spatial predictor site is also determined for each parameter at each site. The tool then fits each time-only, space-only, and combined time-space regression model and constructs a data set containing the R2 and RMSE of each. Finally, UGIT uses this regression table to select the best model, patches the gaps, and adds a flag to the data set indicating which model was actually used to estimate each imputed value. The choice of regression depends not only on the performance of the model, but also on the presence of the data values required for each type of regression. Thus, if a gap resulted from a wide-area outage that lasted two days, values from the best alternative site or from the previous day may not be available, precluding the use of those types of regressions for filling that gap.
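Continuing the sketch above, the fragment below illustrates the fallback behavior just described: for each missing hour, the best-ranked regression whose predictors are actually present is applied, and a flag records which model filled each value. The ranked_models structure, a list of (model name, fitted coefficients, predictor frame) tuples ordered by increasing RMSE, is an assumed representation, not the SAS tool's actual data structures.

```python
# Sketch of gap patching with fallback: apply the best-ranked model whose
# predictors exist for that hour, and flag which model was actually used.
import numpy as np
import pandas as pd

def fill_gaps(target: pd.Series, ranked_models: list) -> pd.DataFrame:
    filled = target.copy()
    flags = pd.Series("measured", index=target.index)
    for hour in target.index[target.isna()]:
        flags[hour] = "unfilled"
        for name, coef, X in ranked_models:           # ordered by increasing RMSE
            row = X.loc[hour]
            if row.isna().any():
                continue                               # predictor missing: fall back
            filled[hour] = coef[0] + float(np.dot(coef[1:], row.to_numpy()))
            flags[hour] = name
            break
    return pd.DataFrame({"value": filled, "model_flag": flags})
```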

We used the UGIT tool to fill gaps for several parameters at all ARM SGP facilities and Oklahoma MESONET sites. Shortwave radiation was measured at the greatest number of sites (53), while longwave radiation was measured at the fewest (21). Most parameters were measured at 46 sites. The UGIT imputation tool produced as output a continuous, gap-free record of every imputed variable at every site.

The table below shows the relative counts of which models proved best and were actually used for imputing missing values in hourly summaries of the listed parameters:

Number of hours imputed
Variable | Best site + Prev day | Best site | Previous day | Previous week | Previous year | All sites avg
precip   |                2,267 |    41,756 |          605 |         2,265 |         1,676 |         2,476
temp     |                  568 |    35,084 |        2,756 |           983 |           336 |        12,132
vpd      |                  628 |    43,546 |        3,880 |           982 |             0 |        23,449
windspd  |                  802 |    34,974 |        1,464 |             0 |         2,800 |        10,907
swrad    |               31,756 |    84,182 |        7,999 |            18 |             0 |        16,686
lwrad    |              223,471 |   295,249 |        2,032 |         1,163 |        62,287 |         4,097

The following two tables show the strength of the predictive relationship using each of the standard types of regressions at a "typical" MESONET site and a "typical" ARM facility:

R2 and (RMSE) of relationship for a "typical" Oklahoma MESONET Site (BLAS)
Variable | Best site + Prev day | Best site   | Previous day | Previous week | Previous year | All sites avg
precip   | 0.27 (0.09)          | 0.27 (0.09) | 0.00 (0.14)  | 0.00 (0.14)   | 0.00 (0.15)   | 0.00 (0.14)
temp     | 0.99 (1.0)           | 0.99 (1.0)  | 0.83 (4.6)   | 0.62 (6.9)    | 0.59 (7.0)    | 0.58 (7.2)
vpd      | 0.96 (0.17)          | 0.96 (0.20) | 0.72 (0.48)  | 0.48 (0.65)   | 0.36 (0.72)   | 0.37 (0.72)
windspd  | 0.79 (1.1)           | 0.78 (1.1)  | 0.08 (2.3)   | 0.02 (2.4)    | 0.02 (2.3)    | 0.03 (2.4)
swrad    | 0.97 (41.2)          | 0.97 (41.8) | 0.75 (132.6) | 0.57 (153.7)  | 0.66 (152.3)  | 0.74 (136.8)
lwrad    | 0.84 (19.2)          | 0.83 (19.8) | 0.59 (39.8)  | 0.43 (47.1)   | 0.37 (43.3)   | 0.40 (48.1)

R2 and (RMSE) of relationship for a "typical" ARM facility (E13 Central Facility)
Variable | Best site + Prev day | Best site   | Previous day | Previous week | Previous year | All sites avg
precip   | 0.39 (0.07)          | 0.40 (0.07) | 0.00 (0.10)  | 0.00 (0.10)   | 0.00 (0.10)   | 0.00 (0.10)
temp     | 0.99 (1.0)           | 0.99 (1.0)  | 0.84 (4.54)  | 0.62 (6.88)   | 0.59 (7.10)   | 0.58 (7.24)
vpd      | 0.96 (0.18)          | 0.96 (0.18) | 0.71 (0.49)  | 0.46 (0.67)   | 0.34 (0.75)   | 0.36 (0.73)
windspd  | 0.81 (1.17)          | 0.81 (1.17) | 0.06 (2.63)  | 0.01 (2.71)   | 0.01 (2.64)   | 0.02 (2.70)
swrad    | 0.74 (48.2)          | 0.66 (55.1) | 0.51 (68.5)  | 0.45 (72.9)   | 0.42 (75.2)   | 0.57 (64.6)
lwrad    | 0.91 (13.9)          | 0.90 (13.9) | 0.62 (40.5)  | 0.47 (47.8)   | 0.49 (45.5)   | 0.45 (48.5)

Parameters differed widely in their predictability. Temperature was the easiest to predict; all models did a reasonable job with this temporally and spatially correlated variable. Vapor pressure deficit was also relatively easy to predict. Precipitation was an imputation challenge, presumably because it is highly variable both temporally and spatially. Windspeed was intermediate between these extremes.

For most parameters, the single best alternative site was usually the best predictor (i.e., it produced the lowest RMSE). The two radiation measurements showed greater use of the hybrid space/time model, and the previous-year model was of some utility in predicting longwave radiation. Longwave radiation is not recorded at the MESONET sites, so the number of hours that must be imputed is much larger for this parameter.

The "Hole-Punching" Experiment

We are planning to perform an experiment to test the effectiveness of data imputation methods with respect to output from selected carbon models. We will start with a complete, unbroken data record of a number of measured ARM parameters. From this complete historical record, we will deliberately remove measurements for certain periods of time, creating artificial gaps of various types. We will retain the complete record of actual measurements with which we started for comparison.

There will be four "factors" in the experimental design of the "Hole-Punching" experiment: the parameter being gapped, the type and duration of the artificial gaps, the imputation method used to fill them, and the carbon model whose output is evaluated.

We will compare the original, unbroken data record with the interrupted and gap-filled data stream using our MAD framework, relative to the output of several carbon models. We will likely have to restrict the investigated levels of each of these factors to prevent the size of the Hole-Punching experiment from becoming unwieldy.
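A minimal sketch of the hole-punching idea appears below, with a synthetic "complete" record, placeholder gap counts and lengths, and a stand-in persistence imputation; the scoring metric (RMSE at the punched hours) is likewise only illustrative.

```python
# Sketch of hole punching: withhold known-good values to create artificial gaps,
# fill them, and score the fill against the values that were removed.
import numpy as np
import pandas as pd

def punch_holes(series: pd.Series, n_gaps: int, gap_len: int, seed: int = 0) -> pd.Series:
    """Return a copy of the series with n_gaps artificial gaps of gap_len hours."""
    rng = np.random.default_rng(seed)
    holed = series.copy()
    starts = rng.integers(0, len(series) - gap_len, size=n_gaps)
    for s in starts:
        holed.iloc[s:s + gap_len] = np.nan
    return holed

def score_fill(truth: pd.Series, filled: pd.Series, holed: pd.Series) -> float:
    """RMSE of imputed values at the artificially punched hours only."""
    punched = holed.isna() & truth.notna()
    err = filled[punched] - truth[punched]
    return float(np.sqrt(np.mean(err ** 2)))

hours = pd.date_range("2001-01-01", periods=24 * 365, freq="H")
truth = pd.Series(15 + 10 * np.sin(np.arange(len(hours)) * 2 * np.pi / 24), index=hours)
holed = punch_holes(truth, n_gaps=20, gap_len=48)
filled = holed.fillna(holed.shift(24)).fillna(holed.mean())   # stand-in imputation
print(score_fill(truth, filled, holed))
```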

After a thorough examination of univariate methods, we plan to examine multivariate imputation methods if the MAD framework evaluation shows that they are warranted for particularly "difficult" parameters such as precipitation. One of the strengths of the ARM data set is that so many different parameters are measured. By utilizing the information present in these multiple cotemporaneous parameters, we should be able to improve substantially on the accuracy of estimated missing data values. Multivariate methods, however, will be substantially more complex than univariate approaches.
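As a rough illustration of what a multivariate approach might look like (not a method we have selected), the sketch below estimates a missing parameter from cotemporaneous parameters measured at the same site; the particular predictor combination is arbitrary.

```python
# Illustration only: estimate a missing parameter from cotemporaneous covariates
# measured at the same site, using a simple least squares regression.
import numpy as np
import pandas as pd

def multivariate_fill(target: pd.Series, covariates: pd.DataFrame) -> pd.Series:
    """Fill gaps in `target` using a regression on covariates measured at the same hours."""
    train = pd.concat([target, covariates], axis=1).dropna()
    A = np.column_stack([np.ones(len(train)), train.iloc[:, 1:].to_numpy()])
    coef, *_ = np.linalg.lstsq(A, train.iloc[:, 0].to_numpy(), rcond=None)
    filled = target.copy()
    missing = target.isna() & covariates.notna().all(axis=1)
    X_miss = np.column_stack([np.ones(missing.sum()),
                              covariates.loc[missing].to_numpy()])
    filled.loc[missing] = X_miss @ coef
    return filled
```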

Spatial interpolation of measured ARM values across areas can be viewed as a special form of imputation. Experience gained in the gap-filling portion of this project should provide insights into the eventual extrapolation of point measurements taken at the spatial constellation of ARM facilities into fully populated, continuous grids of estimated values suitable for driving carbon models operating in regional grid mode.
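For illustration only, the sketch below spreads point measurements onto a regular grid with simple inverse-distance weighting; this is one generic possibility, not a gridding method chosen for the project.

```python
# Generic point-to-grid interpolation sketch using inverse-distance weighting.
import numpy as np

def idw_grid(station_xy: np.ndarray, station_vals: np.ndarray,
             grid_x: np.ndarray, grid_y: np.ndarray, power: float = 2.0) -> np.ndarray:
    """Return a grid of values interpolated from scattered station measurements."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    grid = np.empty(gx.shape)
    for i in range(gx.shape[0]):
        for j in range(gx.shape[1]):
            d = np.hypot(station_xy[:, 0] - gx[i, j], station_xy[:, 1] - gy[i, j])
            if np.any(d < 1e-9):                  # grid cell coincides with a station
                grid[i, j] = station_vals[np.argmin(d)]
            else:
                w = 1.0 / d ** power
                grid[i, j] = np.sum(w * station_vals) / np.sum(w)
    return grid
```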




William W. Hargrove (hnw@fire.esd.ornl.gov)
Last Modified: Fri Sep 20 14:46:25 EDT 2002