Statistical Discrimination of Kinds of Surface Materials
By their Digital Photographic Signatures
Jeff Barron, M.A.
Center for Advanced Spatial Technologies
University of Arkansas
Fayetteville, Arkansas 72701
Christopher Carr, Professor
Department of Anthropology
Arizona State University
Tempe, AZ 85287-2402
Of the ten artifacts chosen for this analysis, seven are breastplates, two are celts, and one is a headplate. Images of each of the ten artifacts, as well as general descriptions of the materials that were identified by Wymer (Chapter 4), are listed in Appendix 8.1. The coordinates of the material types on each artifact provide the locations of the training sites used to build the database of spectral signatures, which in turn is used to classify the materials by predictive modeling.
The artifacts studied here were analyzed for their mineralogy as described by Carr (2001):
“Microsamples of 11 differently colored surface materials – 10 thought to be mineral pigments and one organic binder or adhesive – were removed from 63 locations on 11 copper plaques [B-series], headplates [H-series], and celts [C-series] from four different Ohio Hopewell archaeological sites (depositional and taphonomic environments): Hopewell, Seip, Ater, and Fortney. The samples were taken from areas that are integral parts of likely human or animal images or their contrasting backgrounds, and that appear unnaturally homogenous in color” (Carr 2001).
The inorganic surface materials on each artifact were identified using five different testing methods: electron microprobe analysis using energy dispersive detection, Raman microspectroscopy, x-ray diffraction, SEM microphotography, and petrological description. The materials used were found to fall into two classifications.
Of the inorganic surface materials, one group of surface materials was composed of noncopper compounds that seemed to function as pigments, and was not the result of a corrosive reaction with the copper. “Three noncopper compounds are red, yellow, white, and brown-black in color—the same colors used in other Ohio Hopewell artwork, the colors of the soils used in contrasting distributions to build some Ohio mounds and earthworks, and the colors found in much historic Woodland Native American art and ceremony” (Carr 2001). The second group of surface materials is probably the product of an intentional corrosive reaction with the copper, called patination. The copper-based compounds are red, and shades of blue from aqua to blue-green and turquoise to deeper dark blues (Carr 2001).
Both the copper and the noncopper inorganic materials, as well as the organic materials, that had their spectral signatures characterized in this analysis are listed in Table 8.2.
Univariate Statistical Analysis
The Application of Descriptive Statistics to Predictive Classification
Estimates of central tendency provide a method for identifying each material’s unique spectral characteristics. The degree of overlap in the spectral distributions, as a function of their medians and dispersion, directly affects the predictive classification outcome.
Discriminating Power and the Distribution of Values
Figure 8.1 shows the distribution of brightness values for all materials grouped by each of the spectral bands (one boxplot per band). The range of values recorded by a band determines its discriminating power, which is defined as the number of bands (N) raised to the power of the number of levels recorded (R), or $N^R$. Each 8-bit file contains a range of $2^8$ (256) values, so the maximum discriminating power of a database of five spectral bands is $5^{256}$. As shown in the plots, many spectral ranges do not utilize all 256 values. The actual discriminating power of the data recorded is approximately $5^{238}$. While this loss of discriminating power is only slight, the results illustrate the problem posed for accurate predictive classification by the compression of multiple samples into smaller ranges. Decreases in the discriminating power of the spectral bands lead to corresponding decreases in the predictive classification accuracy of the database.
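The arithmetic behind these figures can be checked directly. The following is a minimal sketch that takes the definition above at face value (bands raised to the power of levels) and uses the two exponents quoted in the text; the function name is illustrative only.

```python
# Discriminating power as defined in the text: N ** R, where N is the
# number of bands and R is the number of brightness levels recorded.
def discriminating_power(n_bands: int, n_levels: int) -> int:
    return n_bands ** n_levels

N_BANDS = 5
max_power = discriminating_power(N_BANDS, 2 ** 8)   # 5 ** 256, full 8-bit range
actual_power = discriminating_power(N_BANDS, 238)   # 5 ** 238, figure quoted in the text

# Python integers have arbitrary precision, so this ratio is exact.
loss_factor = max_power // actual_power              # equals 5 ** 18

print("levels unused:", 2 ** 8 - 238)
print("loss factor:", loss_factor)
```

Expressed this way, the loss corresponds to 18 unused brightness levels out of 256.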
Figure 8.1 shows the distribution of the raw brightness values in each portion of the spectrum recorded. The visible bands—blue, green, and red—possess similar distributions, with slight variations in the ranges of extreme values. The near-infrared and mid-infrared comprise much smaller distributions; however, larger numbers of extreme values are also present in both these layers. The large number of outlier values could indicate errors in selection accuracy, positional error, or a heterogeneously textured material.
Measurement of Central Tendency
A fundamental issue of predictive classification concerns accurate representation of a material’s spectral signature. Both the central tendency and the dispersion of a material’s spectral signature are critical to its discernibility.
The median value is used here to evaluate the central tendency rather than the mean, because the median is more robust: outliers have only a minimal effect on it. The medians and standard deviations of the fifty-two materials are reported in Appendix 8.3. Appendix 8.4 shows number line plots by quarters for the ranges of medians and standard deviations, respectively. Each material’s median depends on its pattern and texture, and the quality of the training site selections.
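The robustness argument can be illustrated with a small sketch. The brightness values below are invented for illustration and are not taken from the study's training sites.

```python
from statistics import mean, median

# Hypothetical 8-bit brightness values for one training site (not study data).
clean = [88, 89, 90, 90, 91, 92, 93]

# The same site with two extreme values added, for example specular highlights.
with_outliers = clean + [250, 255]

mean_shift = mean(with_outliers) - mean(clean)
median_shift = median(with_outliers) - median(clean)

print("shift in mean:", round(mean_shift, 2))
print("shift in median:", median_shift)
```

In this invented case the two extreme values move the mean by tens of brightness levels, while the median moves by a single level.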
Example Comparisons of Median Spectral Signatures Considering Individual Bands. The line plots in Appendix 8.4 provide an initial sense of which surface materials are more or less distinct from each other for particular bands of the spectral signatures, without considering the effect of the spectral variance of the materials. Each line plot is a primitive predictive model of the distinguishability of materials through color and infrared digital photography. For example, in the Blue band, blood-red colored cuprite (material HH) is at one extreme, with a median brightness value of 24, and white serpentine (material ZZ) is at the other extreme, with a median brightness value of 255. These two materials are very discernible in their digital photographic spectra. In contrast, dark brown hide (material FF) with a median brightness value of 89 is very similar on the Blue channel to a yellow pigment with some malachite copper corrosion admixture (material PP) with a median brightness value of 90. In the Red band, however, brown hide and yellow pigment with some malachite admixture are more distinguishable, having median brightness values of 106 and 161, respectively.
Example Comparisons of Median Spectral Signatures Considering All Five Bands. The breastplate artifact referenced as B013b provides five classes of materials. The five materials identified consist of two unrelated materials, charcoal and turquoise, and three similar materials, the outside surface of cremated, smoked bone, the outside surface of white cremated bone, and the inside surface of cremated bone.
Calculating the difference between the spectral layers (blue minus green, green minus red, red minus near-infrared, near-infrared minus mid-infrared) for each sample’s median scores provides a coarse vector representing a spectral signature’s unique rate of change through the measurement space. Table 8.2 identifies the rates of change for each of the materials from artifact B013b. Charcoal (A) gives a change of [3, -4, 15, 5], while Turquoise (G) yields [-15, -51, 27, 85]. These two rates of change show that charcoal’s spectral response is smaller than that of turquoise.
The three bone materials present more uniform responses. Bone Cremation Smoked Outside (WW) changes by [23, -3, 36, 47], Bone Cremation Inside (XX) by [38, 6, 76, -23], and Bone Cremation White Outside (YY) by [28, -2, 61, -13].
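The differencing step described above can be sketched as follows. The five medians in the example are hypothetical: they were chosen only so that their differences reproduce the vector reported for charcoal, and they are not the medians listed in Appendix 8.3.

```python
BANDS = ["blue", "green", "red", "nir", "mir"]

def band_differences(medians):
    """Return [blue-green, green-red, red-nir, nir-mir] for one material."""
    if len(medians) != len(BANDS):
        raise ValueError("expected one median per spectral band")
    # Pair each band with the next one and subtract.
    return [a - b for a, b in zip(medians, medians[1:])]

# Hypothetical medians in band order (blue, green, red, nir, mir).
example_medians = [100, 97, 101, 86, 81]
example_vector = band_differences(example_medians)
print(example_vector)
```

Because only differences are kept, two materials whose medians are offset by a constant brightness in every band would yield the same vector, which is why the text treats it as a coarse description.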
The 95% confidence intervals for the five materials are presented in Figure 8.2. The medians and an interval of ± two standard deviations show the range of potential values for each material. Two of the bone materials, XX and YY, possess very similar spectral signatures, as shown both by similar vectors in Table 8.2 and by their close approximation in Figure 8.2. This common behavior between similar materials illustrates the consistency of the spectral signatures.
Measurement of Dispersion
A second important issue in predictive classification derives from the detrimental effects of increasing range size on the accuracy of the classifications. The primary dispersion measure calculated here was the standard deviation as an expression of the variability of each class of materials. Large standard deviations increase the probability that two materials will exist simultaneously in a single region of the measurement space. Conversely, a material class with a small standard deviation value generates a very narrow distribution, with less likelihood of intersecting an adjacent class. The size of a material’s standard deviation depends on its pattern and texture. A material with a heterogeneous, high contrast texture will possess a larger standard deviation than a homogeneous material with little contrast.
The standard deviations were calculated for each class of materials for all five original spectral layers (Appendices 8.3, 8.4). Overall, the standard deviations appeared to remain consistent across the measurement space for any given class of material. The standard deviations calculated in each band for all materials are listed in Table 8.2.
The standard deviation in each band fluctuated from 9.8 to 11.8. The amount of dispersion was largest in the near-infrared and smallest in the blue band. For each spectral layer, the second quarter of each distribution contains the largest number of materials. In this quarter, discrimination accuracy will be limited, due to the occurrence of such a large number of materials within the quarter; many of their distributions will overlap in the measurement space. Greater amounts of overlap will produce a corresponding reduction in classifier accuracy.
Areas where different distributions overlap one another are especially critical. Overlap of multiple distributions will reduce the discriminating power of the training database for predictive application. Overlap in several bands where a single value could indicate the spectral signature for any of several overlapping distributions decreases the probability of correct assignment of an unknown to a class.
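One simple way to make the notion of overlap concrete is to compare, band by band, the intervals of median ± two standard deviations used in Figure 8.2. The sketch below assumes that interval definition; the material names and values are hypothetical.

```python
def interval(median, sd, k=2.0):
    """Interval of median +/- k standard deviations."""
    return (median - k * sd, median + k * sd)

def intervals_overlap(a, b):
    """True when two closed intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

def overlapping_bands(sig_a, sig_b, k=2.0):
    """List the bands in which two signatures overlap.

    Each signature maps band name -> (median, standard deviation).
    """
    shared = []
    for band in sig_a:
        med_a, sd_a = sig_a[band]
        med_b, sd_b = sig_b[band]
        if intervals_overlap(interval(med_a, sd_a, k), interval(med_b, sd_b, k)):
            shared.append(band)
    return shared

# Hypothetical two-band signatures (not study data).
material_1 = {"blue": (40, 10), "red": (100, 10)}
material_2 = {"blue": (95, 10), "red": (110, 10)}
shared_bands = overlapping_bands(material_1, material_2)
print(shared_bands)
```

In this invented case the two materials separate cleanly in the blue band but overlap in the red band, so a red value alone could not assign an unknown sample to either class.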
Example Comparisons of Spectral Signatures Considering both Median Spectral Response and Variance in Spectral Response. In the above example, the distributions of charcoal (A) and turquoise (G) converge in the red spectral band. When discriminating between these two materials, the red spectral layer will provide the least useful information regarding identification of unique spectral signature characteristics.
Considering all five spectral bands, the three similar materials composed of bone all present similar behaviors in central tendency and dispersion. The other two materials, Turquoise (G) displayed in light blue, and Charcoal (A) shown in dark blue, exhibit different responses from the bone materials. Thus, the charcoal and turquoise materials can be discriminated from the bone materials with a fair degree of accuracy due to the minimum of convergence between the distributions. The three bone materials exhibit a large degree of overlap. Because their spectral signatures are indistinct, the classification of bone into separate classes would be very inaccurate.
In sum, when the materials were mapped individually according to the median values describing their central tendencies, the intersection points between the various materials indicated where the overlap of individual sample distributions created areas of reduced classification accuracy. The more important consideration in terms of classification accuracy was the rate of change, or slope, of each material’s spectral signature, which is unique to that material. Although the central tendencies and dispersion provided the starting point for the classification process, the rates of change became the critical predictors that distinguished between similar materials.
Multivariate Statistical Analysis
The Application of Multivariate Statistics to Predictive Classification
Multivariate statistics compress the measurement space from a large number of variables to a smaller number of variables. This type of reduction has proven useful in other types of archaeological applications, “…because it is a way of disentangling complex patterns of variation which are not otherwise easily assimilated” (Sheenan 1998). The data matrix of dependent and independent variables provides the variance information for comparing the variation in one dependent variable with the amount of covariation in other variables. These comparisons determine the degree of association between dependent and independent variables, derive functions to estimate dependent variables, and calculate the statistical confidence of the results (Green 1976). Since the primary goal of this research concerns the predictive classification of the spectral signatures from Hopewell copper artifacts, discriminant function analysis provided the optimal multivariate statistical method for processing the information contained in the spectral database.
Discriminant Function Analysis
Discriminant function analysis predicts potential membership in classes defined by the dependent variables through a set of classification functions. By processing an unknown sample unit-by-unit, the decision model determines potential membership based on the classification function that provides the highest classifier score at each unit. “Discriminant analysis involves deriving the linear combinations of two (or more) independent variables that will discriminate best between a priori defined groups. This is achieved by the statistical decision rule of maximizing the between-group variance relative to the within-group variance—this relationship is expressed as the ratio of the between-group to within-group variance” (Hair 1998). The application of discriminant function analysis to predictive classification was based on the following assumptions:
- Distinct known classes exist within the sample;
- All the cases used as training data are correctly identified;
- The classes are a random sample;
- Each class’s attributes are normally distributed;
- Similar variance/covariance matrices exist for each group;
- Class a priori probabilities can be estimated;
- All classes that can exist are known.
M. J. Baxter (1994) explains the primary function of discriminant function analysis saying, “Discriminant analysis starts from the presumption that a set of objects are known to belong to one of two or more groups. Two aspects are commonly distinguished – that of discrimination, where new variables are defined that in some sense distinguish between known groups; and that of allocation or classification, where objects are assigned to existing groups on the basis of their characteristics.” The strength of the model developed from the training database determines the predictive accuracy of the discriminant function analysis.
Yet, due to the difficulty in distinguishing between outliers and other unique spectral traits in the materials, and the inherent skewness of the sample data discussed in Chapter 4, the same raw data were standardized by dividing each value by the standard deviation of its respective class. Retention of complete samples is preferable since extreme values could result from either heterogeneous textures or anomalies in extracted samples.
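The standardization described here, dividing each value by the standard deviation of its class, can be sketched as below. The use of the sample (n - 1) standard deviation and the example values are assumptions made for illustration; the text does not specify which estimator was used.

```python
from statistics import stdev

def standardize_by_class_sd(values):
    """Divide every value in one class by that class's standard deviation.

    No values are dropped, so extreme values stay in the sample.
    """
    sd = stdev(values)  # sample standard deviation (an assumption)
    if sd == 0:
        raise ValueError("class has zero variance; cannot standardize")
    return [v / sd for v in values]

# Hypothetical brightness values for one class in one band (not study data).
raw_values = [80, 90, 100, 110, 120]
scaled_values = standardize_by_class_sd(raw_values)
print([round(v, 3) for v in scaled_values])
```

After this scaling each class has unit standard deviation, while the ordering of values, and therefore the position of any extreme values, is unchanged.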
The theoretical model for discriminant function analysis derives from Bayes’ rules of probability. Decision rules for classification require knowledge of the a posteriori probabilities that a pixel belongs to any of the training classes, i, with a feature vector f. Schowengerdt (1997) describes Bayes’ decision rules and the derivations used for discriminant function analysis as:
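The equations cited from Schowengerdt (1997) did not survive in this copy of the text. As a hedged illustration of the general idea only, and not a reproduction of that derivation, the sketch below computes a posteriori class probabilities with Bayes' rule for a single band, assuming normally distributed classes. The class names, priors, means, and standard deviations are hypothetical.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal density, used here as the class-conditional likelihood p(f | i)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posteriors(x, classes):
    """Bayes' rule: p(i | f) = p(f | i) * p(i) / sum over j of p(f | j) * p(j).

    classes maps class name -> (prior, mean, standard deviation).
    """
    joint = {
        name: prior * gaussian_pdf(x, mu, sigma)
        for name, (prior, mu, sigma) in classes.items()
    }
    evidence = sum(joint.values())
    return {name: value / evidence for name, value in joint.items()}

# Hypothetical single-band training classes (not study data).
training_classes = {
    "material_1": (0.5, 60.0, 10.0),
    "material_2": (0.5, 120.0, 10.0),
}
post = posteriors(70.0, training_classes)
best_class = max(post, key=post.get)
print(best_class, post)
```

The decision rule assigns the pixel to the class with the largest a posteriori probability. The study works with five-band feature vectors rather than a single value; one band is used here only to keep the example short.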
Predictive discriminant analysis derives sets of linear functions based on the spectral reflectance values recorded in the training database. The data extraction technique was explained earlier in this chapter. The class that receives the highest classifier score after evaluations of all the discriminant functions becomes the most likely material for that particular unit of the measurement space. Each different class in the training database generates a unique discriminant function. The discriminant function score (Dsc) for a material is defined as:
$$D_{sc} = d_1 x_1 + d_2 x_2 + d_3 x_3 + d_4 x_4 + d_5 x_5 + c$$
where $d_i$ equals the discriminant function coefficient for spectral layer $i$, $x_i$ is the brightness value in that layer, $c$ is the function’s numeric constant, and each coefficient is multiplied by its respective spectral layer. By definition, the discriminant function coefficients are chosen to maximize differences between groups. The sum of the means for a set of discriminant function coefficients is zero, and the standard deviation is equal to one. Each equation creates a hyperplane showing the potential class membership for each material tested. The intersections of the planes form boundaries between the classes in the band space.
The implementation of discriminant function analysis involves two stages. The first stage is significance testing. Significance testing compares variance and covariance information from the materials. Sufficient variance must exist to ensure the dependent variables can be segregated into distinct groups. The matrix of total variance/covariance is compared via a multivariate F test to the pooled within-group variance to test for significant difference between groups. Significant difference between the means allows accurate predictive classification of materials.
The second stage of discriminant function analysis is classification. This process derives a set of classification functions from the known classes of dependent variables. These classification functions comprise the decision-making process of the predictive analysis. Each coordinate of an unknown sample is analyzed using the set of classifier functions. For each cell in the matrix, the decision for potential membership is based on the highest classifier score for that cell.
Significance Testing of Covariance & Group Means
Discriminant function analysis assumes that the covariance between the independent variables will be homogeneous for all the dependent classes. Box’s M tests the fifth assumption, that of homogeneous covariance matrices. Box’s M is calculated as follows.
If p(M) < .05, then the variances differ significantly. Therefore, a significant Box’s M value is undesirable, since it requires rejecting the null hypothesis that the variance of the independent variables remains homogeneous among the dependent classes. The probability of the F-statistic must be greater than 0.05 to support the assumption of homoscedasticity (homogeneity of variances). The results of the Box’s M test are displayed in Table 8.5.
The results of the Box’s M test shown in Table 8.5 indicate that a significant amount of differentiation exists among the covariance matrices for the classes of the dependent variable, and as a result the spectral signatures in the training database do not appear to follow the assumption of homoscedasticity. When analyzing large samples, even small deviations from homogeneity will be found significant by the Box’s M calculation.
Calculation of the log of the determinant matrix for each of the dependent classes allows the degree of individual dependent class covariance to be evaluated. In Table 8.6, the rank column identifies the number of independent variables. In each of the 52 classes, the number of independent variables remains constant at five, the five spectral bands. The pooled within-groups log determinant is -2.5331. Classes that deviate broadly from this value possess larger amounts of covariance. The largest log value belongs to the bone cremation smoked outside material (WW); thus, this material possesses the variance least likely to conform to the homogeneity assumption. The two other bone materials exhibit similar behavior, possessing log determinants more than twice the size of the pooled value. The clay, yellow, malachite, and hide materials also have values more than twice the average value. When the log determinants are ranked by value, twelve are less than the pooled value and forty are greater.
To determine if the groups’ means were unique, Wilks’ Lambda test, also called the maximum likelihood criterion, was employed as a significance test. Wilks’ Lambda ranges from zero to one (Hair 1998). Results closer to zero indicate high potential for discrimination between groups, whereas values closer to one indicate group means that are identical. The Wilks’ Lambda statistic can be converted to an F-statistic, allowing the calculation of significance. A low significance value for the F-statistic indicates a strong difference in the group means.
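The behavior of the statistic at its two extremes can be seen in the single-variable case, where Wilks' Lambda reduces to the within-groups sum of squares divided by the total sum of squares. The sketch below uses that one-band simplification with hypothetical values; the study itself uses the five-band multivariate form.

```python
from statistics import mean

def wilks_lambda_univariate(groups):
    """Within-groups sum of squares divided by total sum of squares.

    groups maps group name -> list of values for a single variable.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    return ss_within / ss_total

# Hypothetical one-band samples (not study data).
well_separated = {"material_1": [10, 12, 14], "material_2": [100, 102, 104]}
identical_means = {"material_1": [10, 12, 14], "material_2": [10, 12, 14]}

lambda_separated = wilks_lambda_univariate(well_separated)
lambda_identical = wilks_lambda_univariate(identical_means)
print(round(lambda_separated, 4), lambda_identical)
```

Groups with widely separated means give a value near zero, and groups with identical means give a value of one.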
As shown in Table 8.7, each Lambda statistic is very low. The small Lambda values indicate that strong differences exist between means of different materials in the data set, and this conclusion is additionally supported by the low significance values of the chi-square statistic. The strong differences between the fifty-two materials analyzed seem to support the assumption of unique central tendencies, and the existence of heterogeneous means that are required for discriminant classification functions to accurately represent the dependent variables.
The significance tests automatically determine the optimum combination of variables in order to maximize discrimination between groups. Orthogonal functions that remain independent of one another are used; thus, each function contributes unique information to predictive classification. The successive functions are determined by a canonical correlation analysis.
Used primarily as a descriptive measure, the canonical correlation shows the strength of agreement between the discriminant scores and the transformed layers. The canonical correlation measures the proportion of the total variability explained by the differences between groups. Each function defined using the canonical correlation incorporates the largest remaining amount of the unexplained variance. Thus, each subsequent function will account for less variance in the measurement space than the preceding function.
The eigenvalues were calculated as the between-groups sum of squares divided by the within-groups sum of squares. The largest eigenvalue indicates the maximum dispersion for the first function as an eigenvector direction. Each subsequent function’s eigenvalue is accompanied by a corresponding eigenvector. The square root of the eigenvalue determines the length of the eigenvector; thus dimensions with very small eigenvalues encompass only small amounts of the original variance, and do not account for any significant amount of dispersion between classes of materials.
The five raw spectral bands, or layers, comprised the independent variables. The materials where more than one sample was extracted during the training process were pooled into a single class, and as a result, the fifty-two materials were condensed into thirty-seven classes. The eigenvalues, cumulative percentage of variance, and the canonical correlations for the analysis are listed in Table 8.8.
As Table 8.8 shows, the first four functions possess a cumulative percentage of 99.3, and the lowest canonical correlation with the original data layers was .806. These results indicate a fairly strong agreement between the original and transformed measurement space. Additionally, even functions with relatively low eigenvalues possess strong canonical correlations between layers and the discriminant scores. The strong correlations from the first four functions indicate that these functions strongly represent the relationships of the data in the spectral layers. The fifth function correlates less strongly to the original variables, but the contribution of this function to predictive classification is minimal since it accounts for less than one percent of the variance inherent in the measurement space.
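The quantities reported in a table of this kind are related to the eigenvalues by two standard identities: each function's share of variance is its eigenvalue divided by the sum of all eigenvalues, and its canonical correlation is the square root of eigenvalue / (1 + eigenvalue). The sketch below applies both; the eigenvalues are hypothetical and are not those of Table 8.8.

```python
import math

def summarize_functions(eigenvalues):
    """Return (eigenvalue, % variance, cumulative %, canonical correlation) rows."""
    total = sum(eigenvalues)
    rows = []
    cumulative = 0.0
    for lam in eigenvalues:
        percent = 100.0 * lam / total
        cumulative += percent
        canonical_r = math.sqrt(lam / (1.0 + lam))
        rows.append((lam, percent, cumulative, canonical_r))
    return rows

# Hypothetical eigenvalues for five discriminant functions (not study data).
example_eigenvalues = [20.0, 6.0, 1.0, 0.9, 0.2]
summary_rows = summarize_functions(example_eigenvalues)
for lam, percent, cumulative, canonical_r in summary_rows:
    print(f"{lam:6.2f} {percent:6.2f} {cumulative:7.2f} {canonical_r:6.3f}")
```

The second identity explains the observation in the text that functions with small eigenvalues can still show fairly strong canonical correlations: an eigenvalue of 1.0, for example, already corresponds to a correlation of about 0.71.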
Both the standardized coefficients and the structure matrix express the contribution of each spectral layer to the classification process. Larger coefficients indicate greater influence on discrimination by a particular layer. In the structure matrix, Pearson’s product moment correlation quantifies the degree of the relationship between two variables. Correlation coefficients measure the degree of agreement between two sets of scores. The coefficients range from minus one to plus one. A zero coefficient indicates no agreement, while a coefficient of one or negative one represents full agreement. A greater agreement between scores corresponds to a greater accuracy of the predictions (Kline 1994). The structure matrix in Table 8.9 shows the within-group correlations for each predictor variable in the canonical function. The strength of correlations was used to determine which spectral layers provided the most or least influence on a particular function. The strongest correlation for each layer is marked with an asterisk. By comparing the results of each column, the contribution each layer made to a particular function was evaluated.
As shown in the results from the structure matrix in Table 8.9, the green, blue, red, and near-infrared layers correlate more strongly with function 4 than with any other function, so that function 4 represents the visible color portion of the spectrum. Function 2 possesses the strongest absolute correlation with the mid-infrared layer, and also the strongest correlation among the entire group of materials. Functions 1, 3, and 5 possess no absolute correlations with any particular spectral bands, yet certain spectral layers (green, near-infrared, and near-infrared, respectively) still possess stronger relationships to these functions than any of the other layers.
Interpretation of Eigenvalues and Structure Matrix
The combination of the eigenvalues and the results of the structure matrix allows the contribution of each spectral layer to the variance represented by each function to be calculated. Function 1 accounts for 72% of the variability in the brightness levels of the bands for all 52 materials. It most strongly and almost solely reflects the green band.
Function 2 expresses 21% of the variability in brightness levels – a still important contribution. It strongly and almost solely expresses midrange-infrared variation in brightness levels.
Function 3 represents only 3.4% of the total variance in brightness levels. It is correlated to moderate degrees to the near-infrared and red bands, and secondarily to the green band.
Function 4 expresses only 3.0% of the total variance of the brightness levels. It reflects all three visible bands (R, G, B) as well as the near-infrared band moderately strongly.
Function 5 accounts for .73% of the variability in brightness levels. It, again, correlates with a wide range of spectral bands (near-infrared, blue, red, and midrange infrared), but to only a moderate to weakly moderate degree. This residual dimension is not especially significant to the distinguishing of artistic compositions.
It is significant that the red and near-infrared bands both correlated substantially (> .37) with three of the five discriminant functions, whereas other bands correlated substantially with only two discriminant functions. This indicates the important contribution made by the adjacent red and near-infrared bands to the total brightness variation of the materials, and the great significance of these reddish bands to discerning the artistic imagery. This conclusion was also reached through the qualitative studies presented in Chapter 7.
Euclidean Distance as a Measure of Similarity
Euclidean geometry provides an effective method for comparison of spectral traits for the various classes of materials. The Euclidean distance coefficient is one of the most commonly implemented measures used to create an n × n matrix of distances between n objects (Sheenan 1998). Each of the five data layers was derived from an 8-bit image, and as a result, each layer possesses a possible range of 256 ($2^8$) values. The uniformity of the ranges in each data layer allows each one the potential to contribute equally to the distance measure. Combining the multiple spectral layers into a five dimensional measurement space creates a matrix where distances between materials are assessed using Euclidean distance transformations. By calculating the straight-line distance between pairs of materials in the five dimensions of the measurement space, the proximity of the various classes of materials was evaluated. Based on the Pythagorean Theorem, the Euclidean distance between two data points is calculated by computing the square root of the sum of the squares of the differences between corresponding values for the various pairs of materials (Sheenan 1998). The Euclidean distance is defined as follows:
$$D = \sqrt{\sum_{k} (X_k - Y_k)^2}$$
The X and Y values used in the equation were the median values for each pair of materials, where X is material 1, Y is material 2, and the subscript k runs over the set of spectral layers. The result of this set of calculations is a 52×52 matrix of distance values that indicate the straight line separation for all combinations of materials. Each row in the matrix represents a specific material and the columns of the matrix also contain the array of all materials. In this way, the pair-wise distance combinations are presented for the entire sample set. As the Euclidean distance value, D, decreases, the accuracy of classification between a pair of materials shows a corresponding reduction. Similarly, increasing distance values, D, result in more accurate classification between the pairs of materials. The entire matrix of Euclidean distances (52×52 = 2,704) is presented in Appendix 8.5.
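The construction of the distance matrix can be sketched as follows, using the verbal definition above (square root of the summed squared differences between corresponding band medians). The three materials and their medians are hypothetical placeholders for the 52 materials of the study.

```python
import math

def euclidean_distance(x, y):
    """Straight-line distance between two vectors of band medians."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def distance_matrix(medians):
    """Pairwise distances; medians maps material name -> list of band medians."""
    names = list(medians)
    return {
        row: {col: euclidean_distance(medians[row], medians[col]) for col in names}
        for row in names
    }

# Hypothetical five-band medians (blue, green, red, nir, mir); not study data.
example_medians = {
    "material_1": [0, 0, 0, 0, 0],
    "material_2": [3, 4, 0, 0, 0],
    "material_3": [200, 180, 160, 90, 60],
}
matrix = distance_matrix(example_medians)
print(matrix["material_1"]["material_2"])
```

The matrix is symmetric with zeros on the diagonal, so for n materials only n(n - 1)/2 of the n × n entries are distinct comparisons between different materials.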
The lowest distance values in the matrix indicate materials where correct predictive classification would be problematic. Logically, most of the comparisons with low Euclidean distance values are between materials that possess similar physical or chemical properties, or between multiple samples of the same material included in the training database. Additionally, since the matrix is 52 materials by 52 materials, each material was compared with itself. If two materials are identical, they possess no dissimilarity; thus, these Euclidean distances are always zero, indicating no separation between materials.
The lowest two percent (40 of 2,704) of the comparisons between different materials are listed in Table 8.10. As shown in the table below, the materials with the lowest scores are multiple samples of the same class of material or materials that are of a single, general kind and that possess similar physical and chemical properties. Similarities in color accounted for a significant portion of the minimal distances, since the three spectral layers that equate to a material’s color would account for sixty percent of the information used to calculate the Euclidean distance coefficient. However, the infrared layers still contribute a portion of the information regarding the separation of materials; consequently, criteria other than color also influence the separability results.
Out of the twenty comparisons representing the materials with the least likelihood of successful classification, ten are between samples of the same material or between materials with only subtle variations in color and chemistry.
In order to ascertain if the transformed measurement space is a more effective classifier than the untransformed raw data, the lowest two percent of the raw Euclidean distances from the separability index were compared with their corresponding distance in the transformed measurement space. Both types of distance scores were converted to standardized units using z-scores, and the two values were then compared to determine if the transformed layers provided an increased amount of separability. The results of these comparisons are shown in Table 8.11.
The transformed measurement space improves the results of the standardized Euclidean distance scores. By maximizing the amount of variance represented in each successive function, the canonical discriminant functions optimize the separability of the materials in the transformed measurement space. The increased separation between the distributions of materials reduces the degree of overlap between similar materials and consequently improves the predictive accuracy of the classifiers.
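The z-score comparison can be illustrated as follows. This is a sketch under stated assumptions: the text does not say which population the distances were standardized against, so the example assumes each distance is standardized against all between-material distances in its own measurement space. The distance values are hypothetical.

```python
import numpy as np

def standardize(distances):
    """Convert distances to z-scores using the mean and standard deviation of the set."""
    d = np.asarray(distances, dtype=float)
    return (d - d.mean()) / d.std()

# Hypothetical distances for the same six material pairs in both spaces
raw_all = np.array([2.4, 12.0, 15.0, 18.0, 20.0, 25.0])
transformed_all = np.array([6.0, 7.0, 8.0, 9.0, 10.0, 11.0])

closest = [0]  # index of the least separable pair in the raw space
z_raw = standardize(raw_all)[closest]
z_transformed = standardize(transformed_all)[closest]

# A positive gain means the pair sits relatively farther apart after transformation
gain = z_transformed - z_raw
print(gain)
```

Standardizing first is what makes the two spaces comparable, since raw and transformed distances are on different scales.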
Discriminant Transformation Functions and Classification Scores
The discriminant functions convert the raw spectral data from the original measurement space to the optimized transformation space for all points in the distribution. The functions consist of five coefficients, one per spectral layer, and a numeric constant. “The traditional approach to interpreting discriminant functions involves examining the sign and magnitude of the standardized discriminant weight (sometimes referred to as a discriminant coefficient) assigned to each variable in computing the discriminant functions. Independent variables with relatively larger weights will contribute more of the discriminating function’s power than do variables with smaller weights” (Hair 1998). The classification results from a set of discriminant functions that assign class membership based on the contributions of the independent variables. Each different type of material in the training database generates a unique discriminant function based on its own within-group mean and variance. The transformation functions are listed in Appendix 8.6.
By calculating a classifier score for the entire set of derived transformation functions at each position in the training database, each location was evaluated for potential membership in all classes of materials. Tabulating the results of the classification scores for the known versus predicted materials indicates the classes of materials where incorrect identifications were most probable. Each function was processed individually; the spectral bands were multiplied by a material’s function coefficients, and the five products of these multiplications were summed with the constant value. The function that generated the highest return value, called a classification score, indicates the most probable group membership.
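The scoring step just described (five band values multiplied by a material's coefficients, summed with its constant, highest score wins) can be sketched as below. The coefficients, constants, and pixel values are hypothetical placeholders, not the functions listed in Appendix 8.6, and only three materials are shown.

```python
import numpy as np

def classify(pixels, coefficients, constants):
    """Score every pixel against every material's function and pick the highest.

    pixels:       (n_pixels, 5) band values
    coefficients: (n_materials, 5) one row of coefficients per material
    constants:    (n_materials,) one constant per material
    """
    scores = pixels @ coefficients.T + constants  # sum of five products plus constant
    return scores.argmax(axis=1), scores

# Hypothetical functions for three materials
coefficients = np.array([
    [0.10, 0.20, 0.05, 0.00, 0.01],
    [0.02, 0.01, 0.30, 0.10, 0.00],
    [0.05, 0.05, 0.05, 0.05, 0.05],
])
constants = np.array([-20.0, -25.0, -5.0])

# Two hypothetical pixels with five band values each
pixels = np.array([
    [100.0, 150.0,  40.0, 30.0, 20.0],
    [ 30.0,  20.0, 140.0, 90.0, 10.0],
])

labels, scores = classify(pixels, coefficients, constants)
print(labels)  # index of the most probable material for each pixel
```

Evaluating all functions in one matrix product is equivalent to processing each function individually and then comparing the results.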
Discriminant Function Classification and Accuracy Assessment
Classification accuracy is a critical feature of predictive analysis. Without accuracy greater than chance, the outcome of a random distribution, there is no justification for the application of the discriminant functions to additional data sets. The correct classes of the values were known prior to their classification by the functions, and the probability of correct classification by random assignment would be less than three percent (1/52 = 0.019). Assessment of function accuracy involves determining the number of incorrect versus correct responses. If each cell along the diagonal of the matrix were the total number of classifications for a given material and all other cells in the matrix were zero, the accuracy would be one hundred percent. However, most discriminant functions did not classify all the values in the data matrix with perfect accuracy; thus, a lower value on the diagonal indicated a reduction in the accuracy of a classifier function.
The tabulated classification matrix of materials predicted by the discriminant functions is listed in Appendix 8.7. Each value in the dataset was classified using the discriminant functions in order to determine the most likely class membership. In order to evaluate these classifications, a 52 x 52 matrix of the materials was constructed, where each row represents the training site data for a material, and the columns represent the predicted material based on the highest classification score from the set of discriminant functions. The diagonal of the matrix indicates the number of values that were trained from and predicted to the same group (correct classification). The overall percentage of correct classification using this set of discriminant functions to predict materials was 82.1 percent.
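The tabulation of known versus predicted materials can be sketched as follows. The labels are hypothetical and use three classes rather than 52. Rows are the known (training) material and columns the predicted material, as in the matrix described above.

```python
import numpy as np

def confusion_matrix(known, predicted, n_classes):
    """Rows: known material; columns: predicted material."""
    m = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(m, (known, predicted), 1)  # count every (known, predicted) pair
    return m

def overall_accuracy(m):
    """Share of all values that fall on the diagonal."""
    return np.trace(m) / m.sum()

def per_class_accuracy(m):
    """Diagonal count divided by the row total for each known material."""
    return np.diag(m) / m.sum(axis=1)

# Hypothetical labels for ten training values in three classes
known = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])
predicted = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])

m = confusion_matrix(known, predicted, 3)
print(m)
print(overall_accuracy(m), per_class_accuracy(m))

# Chance level for 52 equally likely classes, for comparison
chance = 1 / 52
```

Note that `per_class_accuracy` assumes every class has at least one known value; an empty row would produce a division by zero.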
Certain cells in Appendix 8.7 are highlighted based on the results of the discriminant function classification using the thirty-seven pooled materials. The highlighted cells represent the classification accuracy of the discriminant functions. Cells highlighted in grey possess a classification accuracy of greater than fifty percent. The red cells represent thirty to fifty percent classification accuracy for the discriminant functions. These classifications are the most problematic, since they represent either poor classification accuracy, if the cell is a material’s comparison with itself, or comparisons between materials that result in large amounts of misclassification.
Yellow shaded cells represent between one and fifteen percent correct classification. On the diagonal that would include a significant amount of misclassification of a material, but in other cells representing inter-material comparisons, less than fifteen percent error in classification could yield acceptable results for predictive analysis. Non-shaded cells indicate percentages of classification less than one percent, and misclassification error in the minimum range would possess the least amount of influence on the accuracy of the predictive model.
The purpose of accuracy assessment is to determine how efficiently the discriminant functions represent the sample data. While the overall percent of correct classification is useful for determining the predictive strength of the entire set of classification functions, calculating the classifier accuracy for the individual functions is necessary in order to determine how well the individual functions were able to predict specific materials.
These classification accuracy results seem to indicate that the potential copper precipitate materials that were discussed earlier, such as malachite and pseudomalachite, will be problematic for predictive analysis, due to the inability of the function modeling to predict the correct material. The majority of the non-copper precipitates that were believed to be intentionally applied appear to possess enough diversity and unique attributes in their distribution to allow successful classification.
The materials extracted from the training sites on artifact B013b provided a good example of how the discriminant functions classified both correctly and incorrectly. As discussed earlier, this artifact provided three types of related bone materials, and two other non-similar materials, turquoise and charcoal. Examining how these five materials were classified by the discriminant functions showed the strengths and weaknesses of the functions for predictive classification. As shown in Table 8.13, the five materials were classified into sixteen different classes of materials by the discriminant functions. However, the majority of the erroneous classifications involved minute portions of the data, typically less than one percent of the total sample of a material.
The sample of charcoal was assigned to fourteen different categories in the matrix. The amount of the charcoal sample classified correctly was 77 percent, an amount slightly lower than the overall accuracy of the classification functions. Examination of the categories showed that most of the erroneous classifications were less than two percent of the total, and were generally materials that shared properties such as organic content, color, or texture with the charcoal sample.
The largest erroneous classification of charcoal occurred with the azurite lighter and azurite darker discriminant functions. Over sixteen percent of the charcoal sample was erroneously classified to these two functions. Whether these incorrect classifications resulted from incorrect selections during the training site extraction or from the carbonate component of the azurite remains unresolved; new samples of charcoal would have to be extracted and tested with the discriminant functions to see whether they exhibit a similar distribution among the predictive functions.
As expected, the three classes of bone materials showed a significant amount of inter-group classification error. The greatest amount of error was in the material designated bone cremation smoked outside, which had a correct classification rate of only 57 percent. This material became distributed among twelve different functions, mostly in amounts less than two percent; however, twelve percent of the material was classified erroneously as organic hide. Thirty-seven percent of this material was attributed to the other two classes of bone material.
The other outer bone material, bone cremation white outside, had a correct classification rate of seventy-one percent. The remaining twenty-nine percent of this material was attributed to both of the other bone materials by the discriminant classifiers. The erroneous classification of these groups was most likely due to the large degree of overlap between the spectral signatures as shown previously in Appendix 8.3. Correct discrimination among the three similar materials was reduced due to the large amount of shared measurement space and the similarities among their spectral signatures.
The third type of bone material, bone cremation inside, possessed the greatest classifier accuracy of any of the bone groups. Eighty-six percent of this material was classified correctly. Approximately twelve percent was incorrectly determined to be the other two classes of bone material, while the remaining two percent was erroneously attributed to the yellow light material. As shown in Appendix 8.3, the higher accuracy of classification for this material was most likely the result of its spectral signature being more divergent from the other two bone signatures.
Of the five materials from artifact B013b, turquoise presented the optimum classification results. The turquoise classifier function was able to categorize ninety-nine percent of the sample data correctly. The remaining one percent was divided between two other blue materials. Referring to the spectral signature listed in Appendix 8.3, the strong results for the turquoise classifier resulted from a combination of its unique distribution in the measurement space, and the strong influence of its unique hue since the color component contributed the most variance in the transformation functions.
Relationship between Separability of Materials and Classifier Error
By determining the amount of classification error among the materials that were determined to be the least separable in Table 8.10, the effects of the degree of separability on classification accuracy can be explored. As shown in Table 8.12, the separability between the materials was not equivalent to the amount of erroneous classification produced by the particular discriminating function. Additionally, the relationships between the materials of a pair were not reciprocal: although the distance between two samples, A-G, is the same as the distance G-A, the proportion of A classified as G could differ from the proportion of G classified as A.
The distance separating malachite pine green from olive medium dark was approximately 2.67 units, and nearly seven percent of the malachite was categorized as the olive material. However, less than 0.10 percent of the olive material was mistakenly classified as the malachite. These differences in the classifier accuracy resulted from the differences in the amount of overlap between the distributions for the materials. Thus, it would seem that a greater portion of the malachite pine green overlapped into the distribution space of the olive material, and that the olive material had a smaller amount of shared measurement space with the malachite pine green.
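A one-dimensional sketch can show why a single, symmetric distance can produce unequal error in the two directions. It assumes two normally distributed materials and a decision boundary at the midpoint between their means; both assumptions are simplifications of the five-band classification functions. The separation of 2.67 echoes the value quoted above, while the two standard deviations are hypothetical.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def crossover_errors(mean_a, sd_a, mean_b, sd_b):
    """Fractions of A and of B on the wrong side of the midpoint (mean_a < mean_b)."""
    boundary = (mean_a + mean_b) / 2.0
    a_as_b = 1.0 - normal_cdf(boundary, mean_a, sd_a)  # A spilling into B's side
    b_as_a = normal_cdf(boundary, mean_b, sd_b)        # B spilling into A's side
    return a_as_b, b_as_a

# Broadly dispersed material A and compact material B, 2.67 units apart
a_as_b, b_as_a = crossover_errors(0.0, 1.0, 2.67, 0.4)
print(f"A classified as B: {a_as_b:.4f}")
print(f"B classified as A: {b_as_a:.4f}")
```

The distance between the two means is identical in both directions, but the more dispersed material sends a much larger share of its values across the boundary.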
Additionally, the amount of separation between the materials did not necessarily indicate the degree of classification error. The separation between organic hide dark and olive light-medium was 2.69, and the incorrect classification ratio between the materials was 0 to 1.94. There was a smaller amount of separation between malachite pine green and olive light-medium. Yet, this separation of 2.4 still provided better classification results than the larger separation value, since the classification error ratio for these two materials was 0.235 to 0.457. Although the majority of the materials followed the trend of decreasing separability creating large amounts of classification error, there were enough exceptions to this trend to indicate that the separability index provided an effective tool for preliminary analysis of predictive classification, but it was inefficient at indicating the degree of classification error that could be expected.
Figure 8.3 compares the distributions of materials in three layers of the measurement space using both the raw reflectance values and the optimized transformation of the original values. The upper scatterplot, showing the raw reflectance, possesses much broader overlapping distributions for the materials. The lower scatterplot displays the distribution of the materials in the optimized measurement space, where the dispersion of the materials and the resulting overlap between adjacent materials have been reduced.
This figure shows the relationship between overlap of samples and the separability between the materials. Olive medium (U), olive light medium (W), and olive light monocot (Z) possess a large amount of shared measurement space in the raw reflectance values; however, in the transformed layers, the materials exhibit more compact distributions and the separability between the materials increases. The transformation functions alter the distribution of the pine green malachite material (P) more drastically than those of the other materials.
In the upper scatterplot for the raw data, the malachite possesses a skewed distribution, but in the transformed measurement space, the malachite shows a more normal distribution of values, although the large amount of space between the values for this material still creates problems with classification. This reduced classification accuracy is evident in both the discriminant function classification results in Appendix 8.7 and the results from the Euclidean distance comparisons shown in Appendix 8.5.
Case Studies of Copper Artifacts
The derived discriminant classification functions were applied to assess potential membership in classes on both small sections of artifacts and on complete artifacts (Appendices 8.8 and 8.9). The number of spectral signatures processed at one time was limited to fewer than fifteen by the GIS software.
Breastplate B013b. Artifact B013b’s Turquoise sample was the first artifact sample tested for new materials not previously identified. The results of the predictive classification are shown for a 4×4 cm area of the artifact in Figure 8.4. During preliminary signature development, five spectral signatures were extracted from the sample. These materials were charcoal (A), turquoise (G), bone cremation smoked outside (WW), bone cremation inside (XX), and bone smoked cremation outside (YY). The materials tested for possible inclusion were azurite darker (D), blue dark slate light (F), malachite pine green (M), pseudomalachite (S), olive medium (U), and yellow light (TT); none of this second set of materials was sampled from this artifact. The upper-right portion of Figure 8.4 shows the original RGB image of the sample area. Below this image is the legend showing the symbols, material names, number of cells classified, and the percentage of area classified. The large image on the left shows the results of the predictive classification.
The azurite darker material is dispersed throughout the sample area, typically along the margins of the charcoal, turquoise, and malachite pine green materials. The blue dark slate light and turquoise samples were highly intermingled in the predictive classification. Malachite pine green was identified in nearly twenty-five percent of the sample area. Additionally, most of the cells assigned to this material were not previously identified during the extraction process. Charcoal was also identified in areas of the image where it was not indicated during the extraction process. The olive medium material showed concentrations within the charcoal material and along the perimeter of the bone and turquoise materials.
The bone materials possessed problematic classifications. The bone cremation smoked outside and bone cremation inside materials were successfully identified within the regions where the original spectral signatures were extracted. However, the bone cremation white inside material was not identified anywhere in the sample area, even though this was the signature’s origin artifact. This seems to indicate that the similarity and amount of overlap in the spectral signatures prevented the classification scores from discerning the correct material types (see Figure 8.2).
Breastplate B031a. Application of the predictive discriminant functions to this object is shown in Appendices 8.8 and 8.9. The gum (B) material was extracted only from artifact B031a Gum1. The extractions from this artifact are shown in Figure 8.5. During the classification process, this material was classified in nearly twenty-five percent of the sample area. Charcoal (A) was identified in less than two percent of the sample area. The sample area identified as charcoal was a cohesive unit, and appears to result from a graphite or black pen used to encode a catalog number on the artifact.
The gum (B) material is concentrated at the upper portion of the image, along the margins of the materials classified as chrysocolla medium (J) and olive medium (W) materials. Chrysocolla medium (J) and olive medium (W) occurred predominately in the central portion of the sample area. Several cohesive areas of olive medium (W) and serpentine white (ZZ) are found throughout the image.
Feather light medium (AA), a spectral signature extracted from another artifact, was predicted on nearly thirty-two percent of the sample area. Thus, it could be inferred that the gum (B) material was used to affix the feather light medium material to the artifact. Copper (LL) was the only material that was not identified on any area of the artifact, and the blue light (H) material was predicted in less than a fifteenth of a percent of the sample area. This would seem to indicate that these were anomalous classifications due to variation in the spectral data from other materials.
Celt C013a. Artifact C013a Cuprite, shown in Figure 8.6, provides very clear classification results for certain materials. Gum (B) occurs outside the artifact perimeter within the image area, and these results are definitely erroneous classifications. Chrysocolla medium (J) is located on the outer edge of the artifact, where it intersects with both the clay med. brown (MM) and clay light and medium brown (OO).
Gum (B), olive light monocot (AA), and copper (LL) occur only minimally in the sample area. Cuprite dark pink (II) and cuprite light pink (JJ) are found in no portion of the image, although cuprite blood red is classified over almost sixty-eight percent of the image. Unlike the bone materials contained in Figure 8.4, the copper materials possess sufficient spectral separability to be successfully discerned. When the RGB image is compared with the predictive classification, the left portion of the RGB image appears very heterogeneous and would seem to be composed of several different cuprites. However, the predictive classification image shows the classification as a single discrete unit, even when several types of cuprite were included in the predictive classification.
Summary and Conclusion of the Discriminant Function Analysis
The Model. Discriminant function analysis allows the predictive classification of materials based on scoring materials into their most probable categories. In order to maximize the classification accuracy, the dataset, composed of 52 materials and 5 spectral bands, was transformed using a logarithmic function. The model for discriminant function analysis is based on Bayesian decision rules. Building the model involved determining the uniqueness of the groups of independent variables, calculating eigenvalues and eigenvectors to determine the amount of variance represented by each transformed function, and interpreting the structure matrix in order to determine how the original independent variables are represented by the transformed functions. The model’s Function 1, which accounts for 72% of the variability in brightness levels of the bands for all 52 materials, most strongly reflected the green band. Function 2, which accounted for a still substantial 21% of the variability in brightness levels, was comprised primarily of midrange-infrared variation. Function 3, which contributed only 3.4% of the total variance, represented the red and near-infrared bands. Functions 4 and 5, which accounted for much less of the variance in brightness levels and are largely inconsequential, were correlated with four or more bands each. The red band was correlated substantially with four of the five discriminant functions, documenting its importance to the spectral distinctions among materials.
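The eigenvalue step in the model-building description can be sketched as follows. This is a generic canonical discriminant computation run on synthetic data (four hypothetical groups, five bands), not a reproduction of the software or data used in this study. The function name is an assumption, and the logarithmic transformation mentioned above is omitted.

```python
import numpy as np

def canonical_variance_shares(X, y):
    """Eigenvalues of inv(W) @ B and each function's share of their total."""
    classes = np.unique(y)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    W = np.zeros((p, p))  # pooled within-group scatter
    B = np.zeros((p, p))  # between-group scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        W += (Xc - mc).T @ (Xc - mc)
        d = (mc - grand_mean)[:, None]
        B += len(Xc) * (d @ d.T)
    # solve(W, B) equals inv(W) @ B without forming the inverse explicitly
    eigvals = np.linalg.eigvals(np.linalg.solve(W, B)).real
    eigvals = np.sort(np.clip(eigvals, 0.0, None))[::-1]  # largest first
    return eigvals, eigvals / eigvals.sum()

# Synthetic data: 4 groups x 50 values x 5 bands
rng = np.random.default_rng(0)
centers = rng.normal(0.0, 3.0, size=(4, 5))
X = np.vstack([rng.normal(c, 1.0, size=(50, 5)) for c in centers])
y = np.repeat(np.arange(4), 50)

eigvals, shares = canonical_variance_shares(X, y)
print(np.round(shares, 4))  # proportion attributed to each successive function
```

With four groups, at most three functions can carry discriminating variance, so the last shares are effectively zero; with 52 materials and five bands, all five functions can be non-trivial.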
The transformation functions used by the discriminant analysis tended to optimize the separability among materials, as was demonstrated with the list of the least separable materials identified through the use of the Euclidean distances. However, as the results of the user and producer confusion matrix showed, several of the materials that were most problematic for discrimination were materials that possessed larger amounts of separability. In this portion of the analysis, the severity of the overlap of the distribution spaces for the materials played a much more critical role in successful prediction than did the distance separating their central tendencies.
Future Studies with the Method of Discriminant Function Analysis. While the results of classification using the visible and infrared portion of the spectrum provided satisfactory results for basic predictions, increasing the portion of the spectrum sampled would allow identification of a larger portion of the spectral curve of a material. This would increase the chances of recording the portion of the spectrum where a material generates a unique spectral response, which would allow it to be accurately identified.
The predictive model would also benefit from the inclusion of a larger number of benchmark samples. Increasing the types of materials that can be identified by the discriminant model would allow the predictive functions to more closely estimate a material’s correct identity.
with the Image Enhancement Methods Presented in Chapter 7.
Specifically, the single spectral band line plots in Appendix 8.4 show which materials are separated more or less from each other for each of the five spectral bands, i.e., univariately. The Euclidean distance matrix in Appendix 8.5 documents which materials are separated more or less from each other considering all five spectral bands simultaneously, i.e., multivariately with equal weighting of all five bands. The classification matrix in Appendix 8.7, which resulted from the discriminant function analysis, records which materials are discriminable from each other more or less considering all five spectral bands simultaneously and optimizing for differences among materials, i.e., multivariately and with weighting of the five bands according to the degree to which they help to discriminate among materials. All materials could be discriminated from each other with correct identifications between 89.9% and 100%, and most materials with more than 95% accuracy (Table 8.12).
Statistically, a Box’s M test shows that, taken as a set, the covariance matrices of the materials are significantly different from each other. Wilks’ Lambda, and the Chi-square statistic derived from it, indicate that, taken together, the materials exhibit strong differences among themselves in their means.
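For readers unfamiliar with the second statistic, Wilks' Lambda is the ratio of the determinant of the within-group scatter matrix to that of the total scatter matrix. A Chi-square value is commonly derived from it with Bartlett's approximation. The sketch below assumes that approximation (the text does not state which was used) and runs on synthetic data with four hypothetical groups.

```python
import numpy as np

def wilks_lambda(X, y):
    """Return Wilks' Lambda, Bartlett's chi-square approximation, and its degrees of freedom."""
    classes = np.unique(y)
    n, p = X.shape
    g = len(classes)
    centered = X - X.mean(axis=0)
    T = centered.T @ centered  # total scatter
    W = np.zeros((p, p))       # pooled within-group scatter
    for c in classes:
        Xc = X[y == c]
        r = Xc - Xc.mean(axis=0)
        W += r.T @ r
    # log-determinants avoid overflow or underflow in det(W) / det(T)
    log_lambda = np.linalg.slogdet(W)[1] - np.linalg.slogdet(T)[1]
    chi_square = -(n - 1 - (p + g) / 2.0) * log_lambda
    dof = p * (g - 1)
    return float(np.exp(log_lambda)), float(chi_square), dof

# Synthetic data: 4 groups x 50 values x 5 bands
rng = np.random.default_rng(0)
centers = rng.normal(0.0, 3.0, size=(4, 5))
X = np.vstack([rng.normal(c, 1.0, size=(50, 5)) for c in centers])
y = np.repeat(np.arange(4), 50)

lam, chi_square, dof = wilks_lambda(X, y)
print(lam, chi_square, dof)
```

Values of Lambda near zero indicate that most of the total variation lies between groups; the Chi-square value would then be compared against a Chi-square distribution with the stated degrees of freedom.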
The pairs of materials that are most likely to be confused for each other and that theoretically could interfere with the recognition of artistic compositions – 20 of 2,704 pairs – are shown in Table 8.10, based on simply the Euclidean distances between the materials. Most of these problematic cases pertain to materials that are slight variations of a more general class of materials, such as malachites of somewhat different color, or lighter and darker azurites. A few of the cases are more serious, because they represent distinct general classes of materials: (1) organic hide and light to medium olive green variant of malachite; (2) gum and a pine green variant of malachite; and (3) a dark variant of azurite and a pine green variant of malachite.
Of the five spectral bands that were used to characterize the 52 kinds of materials, two were found most essential in distinguishing among them through the discriminant function analysis: the green and the midrange-infrared bands. These two bands dominated discriminant Functions 1 and 2, respectively, which together account for 92% of the brightness variability of the 52 material samples that distinguished them. The red and near-infrared bands were also found important, in that each correlated with and contributed significantly to a large number of discriminant functions: three of the five derived functions. The blue band was found to offer the least discriminating power among samples, and would be expected to be least helpful in general in distinguishing artistic imagery on the copper artifacts. Of course, it would be very helpful in the case of those (minority) artifacts where imagery was produced by creating a blue azurite patina (e.g., breastplate B050).
This quantitative finding differs somewhat from the a priori and qualitative assessments of band significance reported in Chapter 7. There, it is reported that we initially assumed a priori that green would be least important to the definition of artistic images because it would represent natural corrosion rather than pigments of other colors. Even when we understood that the images were produced primarily through copper patination, we weighted the green band as less important than the red and blue bands. The significance of the red band to the definition of artistic images was recognized qualitatively in Chapter 7 and supported quantitatively here. The qualitative assessment made in Chapter 7 did not consider the midrange-infrared and near-infrared bands. The importance attributed to the blue band in defining the artistic imagery differs in the qualitative and quantitative analyses; the qualitative study gave the blue channel more weight.
Archaeology Sources
2001, Carr, Christopher, Development of High-Resolution, Digital, Color & Infrared Photographic Methods for Clarifying Imagery on Hopewellian Copper Artifacts, in Order to Investigate the Origins of Institutionalized, Supralocal Leadership, funded NSF proposal.
1998, Shennan, Stephen, Quantifying Archaeology, Edinburgh: Edinburgh University Press.
1994, Baxter, M.J., Exploratory Multivariate Analysis in Archaeology, Edinburgh: Edinburgh University Press.
GIS Sources
2002, Bossler, John D., ed., Manual of Geospatial Science and Technology, New York: Taylor and Francis.
1997, Schowengerdt, Robert A., Remote Sensing Models and Methods for Image Processing, San Diego: Academic Press.
1992, Avery, Thomas Eugene, Fundamentals of Remote Sensing and Airphoto Interpretation, New York: Macmillan.
Statistical Sources
1998, Hair, Joseph F., Jr., Multivariate Data Analysis, New Jersey: Prentice Hall.
1994, Hayes, William J., Statistics, Fort Worth: Harcourt Brace College Publishers.
1994, Kline, Paul, An Easy Guide to Factor Analysis, New York: Routledge.
1976, Green, Paul E., and Carroll, J. Douglas, Mathematical Tools for Applied Multivariate Analysis, New York: Academic Press.