I began with 24 variables for each of the 4 veg types. Since we have maps of both the mean and the std dev, that is 48 maps per veg type, or 192 variables across all 4 veg types. Then I realized that some of the driver variables are exactly the same for all four veg types. I identified all of the variables (mean and std dev) that were exactly the same over all 4 veg types, and dropped all but one copy of each. Ppt, tmax, tmin, tday, tnight, vpd, swavgfd, and par are the same for all veg types. These 8 variables x 2 statistics = 16 maps, stored 4 times = 64 maps; keeping one copy drops 64 - 16 = 48, leaving 192 - 48 = 144 variables. Then I found that dday, srad, and rhum are also the same for all veg types. These 3 variables x 2 statistics = 6 maps, stored 4 times = 24 maps; keeping one copy of the mn and std drops 24 - 6 = 18, leaving 144 - 18 = 126 variables.
Next, I looked for variables (mean and std dev) that were exactly the same as another variable, or identical over more than one veg type, and dropped all but one copy of them. Par turned out to be the same as swavgfd, so 2 more were dropped (124). Potential evapotranspiration mean and std dev were identical for C3 and C4 grasses, but different for the two tree types, so I dropped 2 more (122) and retained a single pet-grasses variable. Coarse woody debris mean and std dev were all zeros for C3 and C4 grasses, so I dropped both of them for both grass types: 122 - (2 x 2) = 118 variables remaining.
So the elimination of exact duplicates and all-zero maps reduced the number of variables from 192 to 118. Considering that each of these represents a map of the US at 1 sq km resolution with 7.8M cells, we wasted a lot of disk space computing and storing multiple copies of these. Even after dropping all of the exact duplicates, there was still so much multicollinearity among the remaining variables that SAS retained only 113 positive PCA factors from 118 variables.
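The exact-duplicate screen described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for these maps: it assumes each map has been flattened to a NumPy array of cell values, and the variable names and the tiny synthetic arrays are hypothetical stand-ins.

```python
import numpy as np

def find_exact_duplicates(maps):
    """Group variable names whose arrays are identical cell for cell.

    `maps` is a dict of name -> 1-D array (one value per grid cell).
    Returns only the groups that contain more than one name.
    """
    groups = []  # list of (representative array, [names in the group])
    for name, arr in maps.items():
        for rep, names in groups:
            # equal_nan=True treats NaN no-data cells as matching each other
            if arr.shape == rep.shape and np.array_equal(arr, rep, equal_nan=True):
                names.append(name)
                break
        else:
            groups.append((arr, [name]))
    return [names for _, names in groups if len(names) > 1]

def find_constant_maps(maps):
    """Names of maps that hold a single value everywhere (e.g. all zeros)."""
    return [name for name, arr in maps.items()
            if np.nanmin(arr) == np.nanmax(arr)]

# Hypothetical, synthetic stand-ins for the real 7.8M-cell maps.
rng = np.random.default_rng(0)
ppt = rng.random(1000)
pet_grass = rng.random(1000)
maps = {
    "ppt_mn_c3": ppt,
    "ppt_mn_c4": ppt.copy(),
    "ppt_mn_dbf": ppt.copy(),
    "ppt_mn_enf": ppt.copy(),
    "pet_mn_c3": pet_grass,
    "pet_mn_c4": pet_grass.copy(),
    "pet_mn_dbf": rng.random(1000),
    "pet_mn_enf": rng.random(1000),
    "cwdc_mn_c3": np.zeros(1000),
    "cwdc_mn_c4": np.zeros(1000),
}

duplicate_groups = find_exact_duplicates(maps)
constant_maps = find_constant_maps(maps)
for group in duplicate_groups:
    print("keep", group[0], "and drop", group[1:])
print("constant maps:", constant_maps)
```

Running a check like this before the maps are written out would have flagged the shared drivers early. The pairwise comparison is quadratic in the number of maps, which should be acceptable for a couple of hundred variables; hashing each array first would be an alternative for larger sets.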
I considered doing a simple elimination based on pairwise Pearson correlations, but the choice of which variable in each correlated pair to eliminate would be subjective, and the approach would ignore multicollinearity across more than two variables. I also couldn't figure out how to use the PCA factor loadings themselves as an objective basis for reducing the variables to a parsimonious subset. Instead, I fooled SAS proc reg into doing a multicollinearity diagnostic by setting up a "dummy" linear regression model, whose fit I ignored. I did this multicollinearity analysis in two steps: first for the common variables and each of the four veg types separately (5 analyses), and then once with all remaining variables together, in order to detect multicollinearity across the common and veg type sets.
From the first set of separate multicollinearity analyses, I dropped daytime temp (corr w max temp) and night temp (corr w min temp), mn and sd in each case. From C3 grasses, I dropped gpp mn and sd (corr w npp mn, npp sd, pLAI mn). From C4 grasses, I dropped gpp mn and sd (corr w npp mn, npp sd, pLAI mn). From decid broadleaf, I dropped litter carbon mn and sd (corr w veg carbon mn), and I dropped gpp mn and sd and pLAI mn and sd (corr w npp sd). From evergreen needleleaf, I dropped litter carbon mn and sd (corr w coarse woody debris carbon mn) and pLAI mn and sd (corr w npp mn). A total of 18 variables (4 + 2 + 2 + 6 + 4) were dropped in the first phase of multicollinearity analysis, taking us from 118 down to 100 variables.
I tried to drop variables symmetrically, removing the mean and std dev as a pair. I also tried to be symmetric across veg types, but this was not always possible. For example, evergreen needleleaf gpp values were not cross-correlated, so they were kept, but gpp values from the other three veg types were, and were removed.
In the second round of multicollinearity diagnostics, all 100 remaining variables were included in a single analysis to check for excessive correlations across the common variables and the four veg types.
From this combined analysis, I removed C4 days since rain mn and sd (corr w C3 days since rain mn and sd, respectively). I also removed decid broadleaf leaf water potential (psi) mn and sd (corr w C3 psi mn and sd, respectively). Other cross-correlations still existed, but were not as severe as these. This brought the number of variables down from 100 to 96 in the parsimonious set.
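For readers without SAS, the kind of multicollinearity diagnostic used in both rounds can be approximated with NumPy. The notes do not say which proc reg diagnostics were read, so this sketch assumes two common ones: variance inflation factors and intercept-adjusted condition indices. Both depend only on the predictors, which is why the response of the "dummy" regression can be ignored. The variable names and the synthetic data are hypothetical, and the cutoffs in the comments are commonly cited rules of thumb rather than thresholds taken from this analysis.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF of each column: the diagonal of the inverse correlation matrix.

    Exact duplicates must be removed first, or the matrix is singular.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

def condition_indices(X):
    """Condition indices of the centered, unit-length columns of X."""
    centered = X - X.mean(axis=0)
    scaled = centered / np.linalg.norm(centered, axis=0)
    singular_values = np.linalg.svd(scaled, compute_uv=False)
    return singular_values.max() / singular_values

# Synthetic stand-ins: daytime temp is built to track max temp closely,
# while precipitation is generated independently of both.
rng = np.random.default_rng(0)
n_cells = 5000
tmax = rng.normal(size=n_cells)
tday = tmax + 0.01 * rng.normal(size=n_cells)
ppt = rng.normal(size=n_cells)

names = ["tmax_mn", "tday_mn", "ppt_mn"]
X = np.column_stack([tmax, tday, ppt])

vifs = variance_inflation_factors(X)
cond = condition_indices(X)

# Rules of thumb (assumed here): VIF above 10 or a condition index
# above 30 suggests a variable set worth thinning.
for name, v in zip(names, vifs):
    print(f"{name}: VIF = {v:.1f}")
print(f"largest condition index = {cond.max():.1f}")
```

In this toy case the two temperature columns get very large VIFs while precipitation stays near 1, mirroring the decision to drop daytime temp in favor of max temp. Unlike a pairwise correlation screen, a VIF reflects how well each variable is predicted by all of the others jointly, so it also picks up collinearity spread across more than two variables.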