A NEW USE OF TARGET FACTOR ANALYSIS (TFA)

John H. Kalivas, Kevin Higgins • Department of Chemistry, Idaho State University, Pocatello, Idaho 83209 USA • Erik Andries • Department of Mathematics, Central New Mexico Community College, Albuquerque, New Mexico 87106 USA

Classification Situation • Numerous classification approaches • KNN, LDA, MD, ANN, SVM, … • Classification can become more difficult as the number of classes in a problem increases • Target factor analysis (TFA) and net analyte signal (NAS) • TFA and NAS have concurrent calculations of analogous angles between a test sample vector and the respective spaces spanned by the library classes • Useful for binary or multiclass situations

Requirements • Xi = m × n library information matrix for the ith class • m = number of samples • n = number of measurements • Wavelengths for spectra, other physical or chemical variables • Samples making up a library class must span variances making up the class • Instrument profile, temperature effects, measurement process, others • y = n × 1 test sample measurement vector

Orthogonal Projection Spatial Angle (OPSA) • Identical to TFA and NAS • Uses the same orthogonal projection of y
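
A minimal sketch of how such an angle could be computed, assuming the projection is taken onto the span of the first d right singular vectors of a library class matrix X (rows are samples) and that the angle is measured between y and its projection. The function name `opsa_angle`, the use of NumPy, and the toy data are assumptions for illustration and are not taken from the slides.

```python
import numpy as np

def opsa_angle(X, y, d):
    """Angle in degrees between y and its orthogonal projection onto the
    span of the first d right singular vectors of X (assumed definition)."""
    # No mean-centering or scaling is applied before the SVD.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:d].T                         # n x d orthonormal basis for the class
    y_hat = V @ (V.T @ y)                # orthogonal projection of y
    cos_theta = np.linalg.norm(y_hat) / np.linalg.norm(y)
    # Clip guards against rounding slightly outside [0, 1].
    return float(np.degrees(np.arccos(np.clip(cos_theta, 0.0, 1.0))))

# Hypothetical library class spanning the first two coordinate axes.
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

inside = opsa_angle(X, np.array([2.0, 3.0, 0.0]), d=2)    # y lies in the class space
diagonal = opsa_angle(X, np.array([1.0, 0.0, 1.0]), d=2)  # y is halfway out of it
outside = opsa_angle(X, np.array([0.0, 0.0, 1.0]), d=2)   # y is orthogonal to it
print(inside, diagonal, outside)
```

Because the cosine is a ratio of norms, rescaling y does not change the angle in this sketch.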

Process • No data preprocessing • Perform SVD of each library class • Retain d eigenvectors (class-wise) where 1 ≤ d ≤ k and k = rank(X) ≤ min(m,n) • Compute OPSA, MD, and KNN for the test sample relative to each library class • Use leave-one-out cross-validation (LOOCV) • The library class with the smallest angle or MD is assigned as the test sample classification • KNN classification trends evaluated
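
The steps above can be sketched as a leave-one-out loop that assigns each held-out sample to the library class giving the smallest angle. This is an illustrative sketch only: the angle definition, the names `opsa_angle`, `loocv_accuracy`, and `d_by_class`, and the synthetic two-class data are assumptions, and MD and KNN are omitted.

```python
import numpy as np

def opsa_angle(X, y, d):
    """Angle in degrees between y and its projection onto the first d
    right singular vectors of X (assumed definition, no preprocessing)."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:d].T
    cos_theta = np.linalg.norm(V @ (V.T @ y)) / np.linalg.norm(y)
    return float(np.degrees(np.arccos(np.clip(cos_theta, 0.0, 1.0))))

def loocv_accuracy(classes, d_by_class):
    """classes maps label -> (m_i x n) array; d_by_class maps label -> d.
    Returns the fraction of held-out samples assigned to their own class."""
    correct = total = 0
    for label, X in classes.items():
        for i in range(X.shape[0]):
            y = X[i]
            angles = {}
            for other, X_other in classes.items():
                # Remove the test sample from its own class library only.
                lib = np.delete(X_other, i, axis=0) if other == label else X_other
                angles[other] = opsa_angle(lib, y, d_by_class[other])
            predicted = min(angles, key=angles.get)   # smallest angle wins
            correct += int(predicted == label)
            total += 1
    return correct / total

# Synthetic data: each class lies near its own 2-D plane in 4-D space.
rng = np.random.default_rng(0)

def make_class(basis, m):
    coeffs = rng.uniform(1.0, 2.0, size=(m, basis.shape[0]))
    return coeffs @ basis + 0.01 * rng.standard_normal((m, basis.shape[1]))

basis_a = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
basis_b = np.array([[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]])
classes = {"A": make_class(basis_a, 6), "B": make_class(basis_b, 6)}
accuracy = loocv_accuracy(classes, {"A": 2, "B": 2})
print(accuracy)
```

The per-class dictionary for d mirrors the class-wise choice of eigenvectors; how d is selected is left outside this sketch.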

Assessment • Accuracy = (TP + TN)/(TP + TN + FP + FN) • TP = true positives • TN = true negatives • FP = false positives • FN = false negatives • Receiver operating characteristic (ROC) • True positive rate = sensitivity = TP/(TP + FN) • False positive rate = 1 – specificity = 1 – TN/(TN + FP)
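
As a worked illustration of these formulas, the sketch below counts TP, TN, FP, and FN for one class treated as "positive" against all others, then evaluates the measures. The one-vs-rest counting, the function names, and the label lists are assumptions made for the example; the data are invented.

```python
def one_vs_rest_counts(true_labels, predicted_labels, positive):
    """Count TP, TN, FP, FN treating `positive` as the class of interest."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred != positive:
            tn += 1
        elif truth != positive and pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn

def assessment(tp, tn, fp, fn):
    """Accuracy, sensitivity (true positive rate), specificity, and
    false positive rate (1 - specificity)."""
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": specificity,
        "false_positive_rate": 1 - specificity,
    }

# Invented labels for five test samples across three classes.
true_labels = ["A", "A", "B", "B", "C"]
predicted_labels = ["A", "B", "B", "B", "C"]

counts = one_vs_rest_counts(true_labels, predicted_labels, positive="B")
metrics = assessment(*counts)
print(counts, metrics)
```

Each point on a ROC plot pairs the false positive rate with the true positive rate computed this way for one setting, such as one choice of the number of eigenvectors.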

Determining Eigenvectors • Numerous approaches exist to determine the minimum number of eigenvectors needed to span X • Determination of rank by augmentation (DRAUG) • Malinowski ER. J. Chemom. 2011; 25: 323-328 • Distinguishes primary eigenvectors (chemical, instrumental, etc.) from secondary eigenvectors (experimental error) independent of the distribution of the experimental uncertainties

Plastic Data • Six classes (six of seven commercial plastic types 1-6) • Allen V, Kalivas JH, Rodriguez RG. Applied Spec. 1999; 53: 672-681 • Raman spectroscopy (850 – 1800 cm⁻¹, 1093 wavenumbers per spectrum) • Type 1 = polyethylene terephthalate (PET); 30 samples • Type 2 = high-density polyethylene (HDPE); 29 samples • Type 3 = polyvinyl chloride (PVC); 13 samples • Type 4 = low-density polyethylene (LDPE); 22 samples • Type 5 = polypropylene (PP); 23 samples • Type 6 = polystyrene (PS); 29 samples

Plastic Score and Scree Plots • [Figures: scree plot; score plot (legend: Type 1 to Type 6)] • Unique clusters are not formed • Most of the spectral variance is captured with the first eigenvector

Plastic Classification Results • [Figures and table: ROC plot (numbers indicate the number of eigenvectors); total accuracy across all classes; labels OPSA, MD, KNN; specificity, sensitivity, accuracy] • a Values in parentheses are the DRAUG eigenvector number rounded to the nearest whole number

Archeological Data • Four classes (four archeological sources of obsidian) • Kowalski BR, Schatzki TF, Stross FH. Anal. Chem. 1972; 44: 2176-2180 • 10 trace metal concentrations from X-ray fluorescence spectroscopy (Fe, Ti, Ba, Ca, K, Mn, Rb, Sr, Y, and Zr) • Source 1 = 10 samples • Source 2 = 9 samples • Source 3 = 23 samples • Source 4 = 21 samples

Archeological Classification Results • [Figures and table: score plot and scree plot (legend: Source 1 to Source 4); total accuracy across all classes; labels OPSA, MD, KNN; specificity, sensitivity, accuracy] • a Values in parentheses are the DRAUG eigenvector number rounded to the nearest whole number

Gasoil Data • Three classes (three commercial sources of gasoil) • Wentzell P, Andrews D, Walsh J, Cooley J, Spencer P. Can. J. Chem. 1999; 77: 391-400 • Ultraviolet spectroscopy (200 – 400 nm, 572 wavelengths per spectrum) • Source 1 = 59 samples • Source 2 = 25 samples • Source 3 = 30 samples

Gasoil Classification Results • [Figures and table: scree plot and score plot (legend: Source 1 to Source 3); total accuracy across all classes; labels OPSA, MD, KNN; specificity, sensitivity, accuracy] • a Values in parentheses are the DRAUG eigenvector number rounded to the nearest whole number

Extra Virgin Olive Oil (EVOO) Data • Six classes (six adulterant oils) • Poulli KI, Mousdis GA, Georgiou CA. Food Chem. 2007; 105: 369-375 • Synchronous fluorescence spectroscopy (250 – 400 nm at Δ20 nm, 151 wavelengths per spectrum) • Adulterant 1 = corn • Adulterant 2 = olive-pomace • Adulterant 3 = soybean • Adulterant 4 = sunflower • Adulterant 5 = rapeseed • Adulterant 6 = walnut • 31 samples each at 0.5 to 95% adulterant

EVOO Classification Results • [Figures and table: scree plot and score plot (legend: Corn, Olive-pomace, Rapeseed, Soybean, Sunflower, Walnut); total accuracy across all classes; labels OPSA, MD, KNN; specificity, sensitivity, accuracy]

EVOO Concentrations • [Figures: concentration-coded score plot (% sunflower); score plot (legend: Corn, Olive-pomace, Rapeseed, Soybean, Sunflower, Walnut)] • a Values in parentheses are the DRAUG eigenvector number rounded to the nearest whole number

Summary • The TFA or NAS angular measure, OPSA, outperforms MD and KNN over a variety of data sets • If y is normalized to unit length, the same results are obtained with TFA • Class separation need not be obvious in score plots • Need to determine the number of eigenvectors (basis vectors) that characterize each library class • Samples making up a library class need to span the variances making up that library class • Instrument profile • Temperature effects • Others
