Discriminate Analysis of Synthetic Vision System Equivalent Safety Metric 4 (SVS-ESM-4)

Cicely J. Daye, Morgan State University
Louis Glaab, Aviation Safety and Security, SVS GA
Cicely Daye
NASA Langley Research Center, 24 West Taylor St., Hampton, VA, USA 23681

ABSTRACT

Discriminant analysis is an advanced statistical procedure, similar to multiple regression analysis, for identifying the relationships between qualitative criterion variables and quantitative predictor variables. For the Synthetic Vision System (SVS) Equivalent Safety Metric (ESM) development, discriminant analysis is used to establish the weighting factors to be applied to the ESM. Previous ESM development employed inputs from the evaluation pilots to establish weighting factors. Those factors worked well and reflected the pilots' views on safety, but the method was not statistically rigorous. Ultimately, the desired weighting factors are those that enable the ESM to indicate the largest differences between the various safety points in the data. This effort retains the four primary data elements of the previous ESM development work (i.e., workload, situation awareness (SA), readability and % time safety tunnel). The data used for the discriminant analysis were collected from the piloted simulation research concluded in November 2004. These four data variables are considered discriminator variables. The identifier variables are the evaluation pilots with different display concepts, meteorological conditions, and maneuvers. Each group, or test condition, represents a fundamentally different type of procedure and associated accident rate. The three major test conditions of focus are VFR pilots flying the Instrument Landing System (ILS) approach using baseline round-dial (BRD) display concepts, which is considered unsafe; IFR pilots flying the ILS approach using BRD, which is considered acceptably safe; and H-IFR pilots flying the VFR approach using BRD, which is considered very safe.

Keywords: SVS-ESM 4, equivalent safety, analysis, classification
Introduction

The idea of Synthetic Vision Systems (SVS) is to provide pilots with a continuous view of terrain combined with integrated guidance symbology, in an effort to increase situation awareness (SA) and decrease workload during operations in Instrument Meteorological Conditions (IMC). Previous piloted simulations have shown that SA increased and workload decreased, but none have attempted to show the major contributions to safety and operational benefit. Previous efforts to develop an equivalent safety metric (ESM) employed qualitative inputs from evaluation pilots to determine weighting factors (see reference SPIE paper). While this method provided reasonable results, a more formal and quantitative method of establishing the weighting factors that form the metric was warranted. The same weighting factors can also be used to quantitatively establish zones for various levels of safety. Discriminant analysis was used to quantify the relationship between selected components of safety that are associated with flying an approach.

Discriminant analysis (DA) is a statistical procedure used to discriminate between two or more specified categories. Specifically, it establishes the weighting factors to be applied to the multiple variables that make up a function, such that the function can discriminate between the categories. For example, a medical researcher may want to investigate which variables discriminate between patients taking the same medication who (1) recover completely, (2) recover partially, or (3) don't recover. DA could be used to establish the weighting factors to be applied to multiple variables (e.g., sex, age, blood type) that make up a function, such that the function can discriminate between the recovery categories. DA is available in many forms, from linear to multivariate to quadratic.
Linear DA is used when dealing with only two categories and is analogous to multiple regression. The more generalized linear form, for multiple groups, is called Multiple Discriminant Analysis (MDA). Quadratic Discriminant Analysis utilizes nonlinear functions. MDA was applied to the SVS ESM-4 data to establish the weighting factors to be applied to Equivalent Safety Metric-5 (ESM-5). From the weighting factors, a prediction equation is available for future use to determine the level of safety of SVS flight simulations.

Procedure

MDA is frequently used when many measures (discriminator variables) are available in a study, in order to determine the ones that assist in the discrimination of the groups. Variables that help are included in the model

    L = b_1 x_1 + b_2 x_2 + ... + b_n x_n + c

where L is the criterion variable, the b_i are discriminant coefficients, the x_i are discriminating variables, and c is a constant. Variables that do not help are not included. Just as in multiple regression, MDA can be conducted with stepwise procedures; both forward and backward analyses are available. In the forward stepwise procedure, the model is built step by step. During the first step, all discriminator variables are evaluated to determine which one contributes most to the discrimination between groups, and that variable is then included in the model. The following steps repeat the procedure, adding one new variable at each step, until some stopping criterion is met. In the backward stepwise procedure, all variables are included in the model and then, at each step, the variable that contributes least to the prediction of group membership is eliminated, again until some criterion is met. If there are a large number of predictor variables and there is no preconceived model or function to test, then the stepwise method is most likely the best method to use. Also, the best way to analyze large amounts of data is often by parts. It should be noted that the stepwise procedures tend to exaggerate the significance levels, since the procedures exploit chance associations and "pick and choose" the variables to be included in the model so as to maximize discrimination. A reported significance level of .05 may in reality be much worse (greater than .05).

The output of an MDA usually has four parts: (1) preliminary statistics, (2) significance and strength-of-relationship statistics for each discriminant function, (3) discriminant function coefficients, and (4) group classification.

Preliminary statistics describe each group's differences and covariance. They are usually reported as the means and standard deviations for each identifier-variable group, an analysis of variance (ANOVA) for each discriminator variable, the covariance matrices, and Box's M test. The covariance matrices and Box's M indicate whether significant differences exist in the covariance matrices among groups.
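The forward stepwise procedure described in this section can be sketched in a few lines of Python. This is a minimal illustration only, not the SPSS algorithm: SPSS decides entry and removal with F-to-enter and F-to-remove tests, whereas this sketch stops when the drop in Wilks' lambda falls below a threshold (`min_drop`, a hypothetical parameter). The function names and the toy data are assumptions made for the example.

```python
def sscp(rows, cols):
    """Sums of squares and cross products about the mean, for columns `cols`."""
    n = len(rows)
    means = [sum(r[c] for r in rows) / n for c in cols]
    return [[sum((r[a] - ma) * (r[b] - mb) for r in rows)
             for b, mb in zip(cols, means)]
            for a, ma in zip(cols, means)]


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    result = 1.0
    for i in range(size):
        pivot = max(range(i, size), key=lambda r: abs(m[r][i]))
        if abs(m[pivot][i]) < 1e-12:
            return 0.0
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            result = -result
        result *= m[i][i]
        for r in range(i + 1, size):
            factor = m[r][i] / m[i][i]
            for c in range(i, size):
                m[r][c] -= factor * m[i][c]
    return result


def wilks_lambda(groups, cols):
    """Wilks' lambda = det(within-group SSCP) / det(total SSCP)."""
    all_rows = [r for rows in groups.values() for r in rows]
    total = sscp(all_rows, cols)
    within = [[0.0] * len(cols) for _ in cols]
    for rows in groups.values():
        group_sscp = sscp(rows, cols)
        for i in range(len(cols)):
            for j in range(len(cols)):
                within[i][j] += group_sscp[i][j]
    return det(within) / det(total)


def forward_stepwise(groups, n_vars, min_drop=0.05):
    """Add, one at a time, the variable that lowers Wilks' lambda the most."""
    selected, current = [], 1.0
    while len(selected) < n_vars:
        candidates = [v for v in range(n_vars) if v not in selected]
        best = min(candidates, key=lambda v: wilks_lambda(groups, selected + [v]))
        best_lambda = wilks_lambda(groups, selected + [best])
        if current - best_lambda < min_drop:  # simplified stopping rule (assumption)
            break
        selected.append(best)
        current = best_lambda
    return selected, current


# Hypothetical toy data: variable 0 separates the groups, variable 1 does not.
toy_groups = {
    "A": [(1.0, 5.0), (2.0, 1.0), (3.0, 3.0)],
    "B": [(11.0, 3.0), (12.0, 5.0), (13.0, 1.0)],
}
selected, final_lambda = forward_stepwise(toy_groups, n_vars=2)
print(selected, round(final_lambda, 4))
```

On the toy data only variable 0 is retained, because adding variable 1 lowers lambda by less than the threshold. Because the loop always keeps the best-looking candidate, it illustrates the caution above: the selection itself exploits chance associations.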
It is assumed that the covariance matrices of the variables are homogeneous across groups; however, as with many statistical procedures, the multivariate Box's M test for homogeneity of variances/covariances is particularly sensitive to deviations from multivariate normality. MDA is quite robust, and the Box's M test should not be taken too seriously.

Significance and strength-of-relationship statistics are given in the form of Wilks' lambda and eigenvalues. Wilks' lambda shows the significance of each component in the function, and the eigenvalues show the importance of each function. In the MDA output, the first eigenvalue is the largest and the most important in the classification process. The eigenvalues reflect the percent of variance explained in the dependent variable, adding to 100% across all functions.

Wilks' lambda is used in several contexts in MDA. First, it is used in an ANOVA (F) test of mean differences. The smaller the lambda for an independent (discriminator) variable, the more that variable contributes to the discriminant function. The F test of Wilks' lambda thus shows which variables' contributions are significant. Second, Wilks' lambda is used to test the significance of each discriminant function in MDA. Specifically, what is tested is the significance of the eigenvalue for each function, which measures the difference between the group centroids (vectors of means) of the discriminator variables. The smaller the lambda, the greater the differences. Lambda varies from 0 to 1: values near 0 indicate that the group means differ (and thus that the function differentiates the groups well), and 1 indicates that all group means are the same.
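For a single discriminator variable, Wilks' lambda is the within-group sum of squares divided by the total sum of squares, and it converts to the one-way ANOVA F statistic as F = ((1 - lambda) / lambda) * (df2 / df1), with df1 = groups - 1 and df2 = cases - groups. The sketch below illustrates both steps. The function names and the toy data are assumptions. The check at the end uses one row of the Tests of Equality of Group Means output reported in this paper (lambda .614, F 6.915, df 2 and 22).

```python
def univariate_wilks_lambda(groups):
    """Wilks' lambda for one variable: within-group SS / total SS."""
    all_values = [v for values in groups for v in values]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = 0.0
    for values in groups:
        mean = sum(values) / len(values)
        ss_within += sum((v - mean) ** 2 for v in values)
    return ss_within / ss_total


def f_from_lambda(lam, n_cases, n_groups):
    """Convert a univariate Wilks' lambda to the one-way ANOVA F statistic."""
    df1 = n_groups - 1
    df2 = n_cases - n_groups
    return (1.0 - lam) / lam * df2 / df1, df1, df2


# Hypothetical toy data: two well-separated groups give a lambda near 0.
toy_lambda = univariate_wilks_lambda([[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]])

# Reported row: lambda = .614 with 25 cases in 3 groups.
f_stat, df1, df2 = f_from_lambda(0.614, n_cases=25, n_groups=3)
print(round(toy_lambda, 4), round(f_stat, 3), df1, df2)
```

The recomputed F agrees with the reported 6.915 to within the rounding of lambda.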
Discriminant function coefficients are presented as unstandardized and standardized function coefficients. The standardized coefficients represent each discriminator variable's contribution to each function. The unstandardized function coefficients, when multiplied by the values of an observation, project an individual onto its discriminant axis.

Group classification is used to determine the extent to which group differences support the functions generated, by reviewing the group means for each function as presented in the Functions at Group Centroids table. Group centroids are the mean discriminant scores for each group on each discriminant function.

Discussion/Results

For the SVS ESM-5, multiple discriminant analysis (MDA) was used to establish the weighting factors. Ultimately, these are the weighting factors that enabled the ESM to indicate the largest differences between the various safety categories in our data. The SPSS statistical software package was used for the SVS ESM-5 MDA. The output is similar to the one described in the preceding section.

A combination of subjective and objective data measures was used in preliminary research completed in November 2004 to quantify the relationship between selected components of safety that are associated with flying an approach. Four information display methods, ranging from a round-dial baseline through a fully integrated SVS package, were investigated in this high-fidelity flight experiment. In addition, a broad spectrum of general aviation (GA) pilots was employed, in an attempt to enable greater application of the results and to determine whether an equivalent level of safety is achievable through the incorporation of SVS technology regardless of a pilot's flight experience. The SVS display provides commercial and general aviation pilots with clear-day operations all of the time.
The major performance measures, the independent variables in our discriminant analysis, were workload, situation awareness (SA), readability and % time safety tunnel, taken from the data collected in the piloted simulation research concluded in November 2004. These four measures are considered discriminator variables. The identifier variables are the evaluation pilots with different display concepts, meteorological conditions, and maneuvers. These variables are used to identify group membership. For the SVS ESM experiment, the groups represent fundamentally different types of procedures and associated rates of accidents. The three major groups of focus are VFR pilots flying the Instrument Landing System (ILS) approach using baseline round-dial (BRD) display concepts, which is considered unsafe; IFR pilots flying the ILS approach using BRD, which is considered acceptably safe; and H-IFR pilots flying the VFR approach using BRD, which is considered very safe.

The MDA produced two criterion variables, or functions, that can be used to discriminate between the three focus groups (unsafe, safe, and very safe). The first function, which is the most powerful differentiating function, maximizes the differences between the values for the three groups. The second function, which is less powerful, is orthogonal to the first (and uncorrelated with it) and maximizes the differences between the values for the three groups while controlling for the first function.

Preliminary Statistics

The SPSS output of preliminary statistics consists of the Analysis Case Processing Summary and the Group Statistics. The Analysis Case Processing Summary output box displays the unweighted valid and excluded cases. The Group Statistics output box displays the mean and standard deviation of each discriminator variable in each group. It also displays the weighted and unweighted numbers of valid cases.
Analysis Case Processing Summary

Unweighted Cases                                                  N     Percent
Valid                                                             25    100.0
Excluded   Missing or out-of-range group codes                    0     .0
           At least one missing discriminating variable           0     .0
           Both missing or out-of-range group codes and
           at least one missing discriminating variable           0     .0
           Total                                                  0     .0
Total                                                             25    100.0
Group Statistics

(Rows 1 to 4 are the four discriminator variables, listed in the order printed; their labels are not legible in the source output.)

TESTCO_A                   Row    Mean      Std. Deviation   Valid N (listwise)
                                                             Unweighted   Weighted
H-IFR Pilots flying VFR    1       87.56    44.836           9            9.000
                           2        8.11     1.833           9            9.000
                           3       54.89    12.005           9            9.000
                           4       93.56     6.464           9            9.000
VFR Pilots flying ILS      1      -13.88    45.846           8            8.000
                           2        4.75     2.550           8            8.000
                           3       37.13     9.203           8            8.000
                           4       48.25    33.906           8            8.000
IFR Pilots flying ILS      1        7.50    59.759           8            8.000
                           2        6.25     1.669           8            8.000
                           3       40.38    10.056           8            8.000
                           4       78.13    31.787           8            8.000
Total                      1       29.48    66.160           25           25.000
                           2        6.44     2.417           25           25.000
                           3       44.56    12.904           25           25.000
                           4       74.12    31.844           25           25.000

Significance and Strength of Relationship Statistics

The Tests of Equality of Group Means output box gives the Wilks' lambda and F statistic, along with the significance, for each discriminator variable. Variables that are not found to be significant may be considered for elimination from the discriminant functions.

Tests of Equality of Group Means

Row    Wilks' Lambda    F        df1    df2    Sig.
1      .531             9.711    2      22     .001
2      .656             5.779    2      22     .010
3      .614             6.915    2      22     .005
4      .635             6.322    2      22     .007

The Pooled Within-Groups Matrices output box shows the correlations between the variables. The matrix is symmetric about its main diagonal.

Pooled Within-Groups Matrices: Correlation

Row    1        2        3        4
1      1.000    .283     .369     .637
2      .283     1.000    .491     .083
3      .369     .491     1.000    .330
4      .637     .083     .330     1.000

The Summary of Canonical Discriminant Functions gives the eigenvalues and Wilks' lambda of each function. The eigenvalues show how much of the variance between the groups is accounted for by each of the functions. Wilks' lambda tests the significance of each function.

Eigenvalues

Function    Eigenvalue    % of Variance    Cumulative %    Canonical Correlation
1           1.207 (a)     91.3             91.3            .739
2           .115 (a)      8.7              100.0           .321

a. First 2 canonical discriminant functions were used in the analysis.
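The columns of the Eigenvalues table, and the Wilks' lambda and chi-square values reported for the same two functions, are linked by standard identities, so they can be cross-checked from the two eigenvalues alone. The sketch below is such a check, not a reproduction of SPSS internals: it assumes 25 cases, 4 discriminator variables and 3 groups, uses Bartlett's chi-square approximation, and starts from eigenvalues already rounded to three decimals, so small discrepancies are expected. The helper names are assumptions.

```python
import math

eigenvalues = [1.207, 0.115]           # reported in the Eigenvalues table
n_cases, n_vars, n_groups = 25, 4, 3   # study dimensions

# Share of between-group variance carried by each function.
total = sum(eigenvalues)
pct_variance = [100.0 * ev / total for ev in eigenvalues]

# Canonical correlation for each function: sqrt(ev / (1 + ev)).
canonical_corr = [math.sqrt(ev / (1.0 + ev)) for ev in eigenvalues]


def wilks_from(k):
    """Wilks' lambda for functions k through the last (k is 0-based)."""
    lam = 1.0
    for ev in eigenvalues[k:]:
        lam /= 1.0 + ev
    return lam


def bartlett_chi_square(k):
    """Bartlett's chi-square approximation and its degrees of freedom."""
    lam = wilks_from(k)
    chi2 = -(n_cases - 1 - (n_vars + n_groups) / 2.0) * math.log(lam)
    df = (n_vars - k) * (n_groups - k - 1)
    return chi2, df


for k in range(len(eigenvalues)):
    chi2, df = bartlett_chi_square(k)
    print(round(pct_variance[k], 1), round(canonical_corr[k], 3),
          round(wilks_from(k), 3), round(chi2, 2), df)
```

The recomputed values land within rounding of the reported ones: 91.3% and 8.7% of variance, canonical correlations of about .739 and .321, Wilks' lambda of about .407 and .897, and chi-square values close to 18.453 (8 df) and 2.226 (3 df).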
Wilks' Lambda

Test of Function(s)    Wilks' Lambda    Chi-square    df    Sig.
1 through 2            .407             18.453        8     .018
2                      .897             2.226         3     .527

Discriminant Function Coefficients

The Canonical Discriminant Function Coefficients are the coefficients of the variables and indicate the relative importance of the independent variables in predicting the classifications. They should be used to assess each independent variable's unique contribution to the discriminant function. They can also be used to compute the canonical variable score: each coefficient is multiplied by the value of its variable, and the products are summed together with the constant.

(In the two tables below, rows 1 to 4 are the four discriminator variables, listed in the order printed; their labels are not legible in the source output.)

Canonical Discriminant Function Coefficients (unstandardized)

Row           Function 1    Function 2
1             .009          -.012
2             .015          .034
3             .089          .013
4             .031          -.011
(Constant)    -3.350        -1.786

Standardized Canonical Discriminant Function Coefficients

Row    Function 1    Function 2
1      .455          -.628
2      .406          .912
3      .182          .027
4      .326          -.114

The Structure Matrix provides another way to study the usefulness of each variable in the discriminant function. The structure coefficients indicate the simple correlations between the variables and the discriminant functions. The coefficients may be used to assign meaningful labels to the discriminant functions.
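The structure coefficients described above can be recovered from two outputs already reported: each one is the pooled within-groups correlation matrix multiplied by the standardized coefficients. The sketch below applies that identity to the reported values. It assumes the correlation matrix and the coefficient table list the variables in the same order. The helper name `structure_matrix` is an assumption.

```python
# Pooled within-groups correlation matrix, as reported.
pooled_corr = [
    [1.000, 0.283, 0.369, 0.637],
    [0.283, 1.000, 0.491, 0.083],
    [0.369, 0.491, 1.000, 0.330],
    [0.637, 0.083, 0.330, 1.000],
]

# Standardized canonical discriminant function coefficients, as reported
# (one row per variable, one column per function).
std_coeffs = [
    [0.455, -0.628],
    [0.406, 0.912],
    [0.182, 0.027],
    [0.326, -0.114],
]


def structure_matrix(corr, coeffs):
    """Correlation of each variable with each function: corr x coeffs."""
    n_funcs = len(coeffs[0])
    return [[sum(corr[i][j] * coeffs[j][f] for j in range(len(coeffs)))
             for f in range(n_funcs)]
            for i in range(len(corr))]


structure = structure_matrix(pooled_corr, std_coeffs)
for row in structure:
    print([round(value, 3) for value in row])
```

The four computed rows match the reported Structure Matrix to within rounding. SPSS prints that table sorted by absolute size of correlation within function, so the rows computed here, in coefficient-table order 1 to 4, appear there in the order 1, 4, 3, 2.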
Structure Matrix

Function 1    Function 2
.845*         -.433
.710*         -.429
.657*         .205
.652          .738*

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions.
Variables ordered by absolute size of correlation within function.
*. Largest absolute correlation between each variable and any discriminant function.

Group Classification

The Functions at Group Centroids table gives the average discriminant scores for each test condition, or group. The cutting points that set the ranges of the discriminant scores for classifying cases are determined from the group centroids. A group centroid is the mean value of the discriminant scores for a given group.

Functions at Group Centroids

TESTCO_A                   Function 1    Function 2
H-IFR Pilots flying VFR    1.290         -.146
VFR Pilots flying ILS      -1.168        -.291
IFR Pilots flying ILS      -.283         .455

Unstandardized canonical discriminant functions evaluated at group means.

[Figure: Canonical Discriminant Functions. Plot of Function 2 against Function 1 for the TESTCO_A groups (IFR Pilots flying ILS, VFR Pilots flying ILS, H-IFR Pilots flying VFR), with group centroids marked.]

Included among the Classification Statistics, which indicate which cases were processed in the analysis, are the Prior Probabilities for Groups.
Prior Probabilities for Groups

TESTCO_A                   Prior    Cases Used in Analysis
                                    Unweighted    Weighted
H-IFR Pilots flying VFR    .333     9             9.000
VFR Pilots flying ILS      .333     8             8.000
IFR Pilots flying ILS      .333     8             8.000
Total                      1.000    25            25.000

The final output, Classification Results, measures the degree of success of the discriminant functions and whether they work equally well for each group.

Classification Results (a)

                                          Predicted Group Membership
                                          H-IFR flying VFR     VFR flying ILS       IFR flying ILS
           TESTCO_A                       Approach with BRD    Approach with BRD    Approach with BRD    Total
Original
  Count    H-IFR Pilots flying VFR        8                    0                    1                    9
           VFR Pilots flying ILS          0                    4                    4                    8
           IFR Pilots flying ILS          1                    1                    6                    8
  %        H-IFR Pilots flying VFR        88.9                 .0                   11.1                 100.0
           VFR Pilots flying ILS          .0                   50.0                 50.0                 100.0
           IFR Pilots flying ILS          12.5                 12.5                 75.0                 100.0

a. 72.0% of original grouped cases correctly classified.

The Classification Function Coefficients can be used to classify new observations.

Classification Function Coefficients

(Rows 1 to 4 are the four discriminator variables, listed in the order printed; their labels are not legible in the source output.)

              TESTCO_A
              H-IFR Pilots flying VFR    VFR Pilots flying ILS    IFR Pilots flying ILS
Row           Approach with BRD          Approach with BRD        Approach with BRD
1             -8.678E-02                 -.107                    -.109
2             .129                       8.601E-02                .125
3             .752                       .531                     .620
4             .681                       .606                     .626
(Constant)    -25.051                    -16.437                  -20.152

Fisher's linear discriminant functions.

The Classification Results present the predicted group membership compared with the original membership, along with the accuracy of each prediction. The bottom line of discriminant analysis is the accuracy of prediction. By using the standardized discriminant function coefficients, one can attempt to quantify the relative importance of each predictor. The accuracy of prediction in the ESM-4 discriminant analysis shows that 72% of the original grouped cases were correctly classified.
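The 72% figure, and the per-group percentages in the Classification Results, follow directly from the count table: correctly classified cases lie on the diagonal. The short sketch below recomputes them from the reported counts. The variable names and the shortened group labels are assumptions.

```python
# Rows are original groups, columns are predicted groups, in the order
# H-IFR flying VFR, VFR flying ILS, IFR flying ILS (reported counts).
labels = ["H-IFR flying VFR", "VFR flying ILS", "IFR flying ILS"]
counts = [
    [8, 0, 1],
    [0, 4, 4],
    [1, 1, 6],
]

# Correct classifications sit on the diagonal of the count table.
correct = sum(counts[i][i] for i in range(len(counts)))
total = sum(sum(row) for row in counts)
overall_accuracy = 100.0 * correct / total

# Percentage of each original group that was classified into itself.
per_group_accuracy = [100.0 * counts[i][i] / sum(counts[i])
                      for i in range(len(counts))]

print(round(overall_accuracy, 1))
for label, pct in zip(labels, per_group_accuracy):
    print(label, round(pct, 1))
```

The per-group values show what the overall figure hides: the H-IFR group is classified correctly far more often than the VFR pilots flying ILS, half of whom are assigned to the IFR group.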
When the groups are analyzed in pairs with the stepwise method, the accuracy of prediction increases for the safe vs. very safe and the unsafe vs. very safe comparisons. In the cumulative analysis of all test conditions, the tabulated accuracy under the stepwise method is 60.0%, compared with 72.0% without it. For the safe vs. unsafe analysis, SPSS returns an error at the stepwise analysis because the data do not meet the entry criterion at the initial step.

Test Conditions         Accuracy of Prediction    Accuracy of Prediction (stepwise)
Safe vs. unsafe         68.8%                     Insufficient data
Safe vs. very safe      82.4%                     88.2%
Unsafe vs. very safe    94.1%                     100%
All test conditions     72.0%                     60.0%

The MDA produced the desired Standardized Canonical Discriminant Function Coefficients, which was the main goal. The desired result was for TLX, MCH and % time safety tunnel to contribute roughly equally to the discriminant function, with SA as the largest contributor. SPSS returned coefficients as desired, except that the MCH coefficient was the lowest of all.

The next step in determining the equivalent level of safety was to apply the Classification Function Coefficients to the other SVS cases. When applying the Classification Function Coefficients, a case is assigned to the group whose classification function gives the highest of the three scores. For the 275 cases, 6 cases were considered unsafe, 61 were considered safe and 153 were considered very safe.

Conclusion

In conclusion, the MDA provided the weighting factors and classification coefficients we were looking for. These results provide an opportunity to continue research with SVS and to use the ESM-5 for various testing procedures. The results also allow for the safety classification of current and future Synthetic Vision Systems flight tests.
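The highest-score rule for applying the Classification Function Coefficients can be sketched as follows. The coefficients and constants are the reported Fisher's linear discriminant functions. The observation is hypothetical: it is made up for illustration and is assumed to list the four discriminator variables in the same row order as the coefficient table. The function name `classify` is also an assumption.

```python
# Reported classification function coefficients (rows 1-4) and constants.
classification_functions = {
    "H-IFR pilots flying VFR approach with BRD":
        ([-8.678e-02, 0.129, 0.752, 0.681], -25.051),
    "VFR pilots flying ILS approach with BRD":
        ([-0.107, 8.601e-02, 0.531, 0.606], -16.437),
    "IFR pilots flying ILS approach with BRD":
        ([-0.109, 0.125, 0.620, 0.626], -20.152),
}


def classify(observation):
    """Score the observation on every function; the highest score wins."""
    scores = {}
    for group, (coeffs, constant) in classification_functions.items():
        scores[group] = constant + sum(c * x for c, x in zip(coeffs, observation))
    best_group = max(scores, key=scores.get)
    return best_group, scores


# Hypothetical observation, in the coefficient table's row order (assumption).
observation = [50.0, 80.0, 6.0, 45.0]
group, scores = classify(observation)
print(group)
for name, score in scores.items():
    print(name, round(score, 3))
```

In the paper's terms, the H-IFR function corresponds to the very safe condition, the IFR function to the acceptably safe condition, and the VFR function to the unsafe condition, so the winning function gives the safety category assigned to the case.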