Machine Learning Feature
Introduction
The Machine Learning Feature groups algorithms that reduce dimensionality, cluster spectra, or classify labeled spectra. Principal Components Analysis (PCA) and T-SNE Dimensionality Reduction help visualize high-dimensional spectra, Hierarchically-clustered Heatmap groups similar spectra, and Random Forest (RF), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM) perform supervised classification using the labels created during upload.
Label setup
- Open Data Upload.
- Set Number of classes. Use one class for unlabeled exploratory workflows, or two or more classes when you plan to run supervised classification.
- Load or query data inside each Class expander. If you upload more than one class, SpectraGuru interpolates the classes to a shared Raman shift grid before combining them.
- Use the label editor after upload to review the Spectrum and Label columns. Labels are stored as the class values used by PCA coloring and by Random Forest, KNN, and SVM classification.
- Continue to the Processing Page if preprocessing is needed, then open the Analytics Page and select a machine learning feature from Select Analytics Plot.
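The interpolation step mentioned above can be pictured with a short sketch. This is only an illustration under stated assumptions, not SpectraGuru's actual code: the function name `interpolate_to_shared_grid`, the choice of the overlapping Raman shift range, the `n_points` parameter, and linear interpolation via NumPy are all hypothetical.

```python
import numpy as np


def interpolate_to_shared_grid(classes, n_points=500):
    """Resample every class onto one Raman shift grid (hypothetical helper).

    classes maps a label to (shift, intensities), where shift is a 1-D
    increasing array and intensities has shape (n_spectra, len(shift)).
    """
    # Assumption: use only the range covered by every class, so that no
    # spectrum has to be extrapolated.
    lo = max(shift[0] for shift, _ in classes.values())
    hi = min(shift[-1] for shift, _ in classes.values())
    if lo >= hi:
        raise ValueError("classes do not share an overlapping Raman shift range")
    grid = np.linspace(lo, hi, n_points)

    rows, labels = [], []
    for label, (shift, intensities) in classes.items():
        for spectrum in np.atleast_2d(intensities):
            # Linear interpolation of one spectrum onto the shared grid.
            rows.append(np.interp(grid, shift, spectrum))
            labels.append(label)
    return grid, np.vstack(rows), labels
```

The returned matrix has one row per spectrum and one column per grid point, with a parallel list of labels, which is the shape most analysis routines expect.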
Classification pages require every selected spectrum to have a label, and the selection must contain at least two unique classes.
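The two requirements can be expressed as a small pre-flight check. The sketch below is illustrative only: `check_labels` is a hypothetical helper, and treating `None` or blank strings as missing labels is an assumption rather than documented SpectraGuru behavior.

```python
def check_labels(labels):
    """Return the sorted unique classes, or raise if classification cannot run."""
    # Assumption: a label counts as missing when it is None or blank.
    missing = [
        i for i, label in enumerate(labels)
        if label is None or str(label).strip() == ""
    ]
    if missing:
        raise ValueError(f"spectra without a label at positions {missing}")

    classes = sorted({str(label).strip() for label in labels})
    if len(classes) < 2:
        raise ValueError("classification needs at least two unique classes")
    return classes
```

For example, `check_labels(["control", "treated", "control"])` returns the two class names, while a list with a blank entry or a single repeated label raises a `ValueError`.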
Included methods
| Method | Use |
|---|---|
| Principal Components Analysis (PCA) | Linear dimensionality reduction and loading inspection |
| T-SNE Dimensionality Reduction | Nonlinear 2D neighborhood visualization |
| Hierarchically-clustered Heatmap | Ward-linkage clustering and heatmap/dendrogram display |
| Random Forest(RF) Classification | Supervised ensemble classification |
| K-Nearest Neighbors(KNN) Classification | Supervised distance-based classification |
| Support Vector Machine(SVM) Classification | Supervised margin-based classification |
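For readers who want to see what these methods look like outside the interface, the following sketch runs each one on synthetic data. It assumes scikit-learn and SciPy, which this page does not confirm as SpectraGuru's backend. The synthetic spectra, the hyperparameters (`n_estimators`, `n_neighbors`, `perplexity`, the RBF kernel), and the 5-fold cross-validation are illustrative choices, not the application's defaults.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.manifold import TSNE
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for spectra (not real Raman data): 40 spectra with
# 200 points each; class 1 carries an extra Gaussian peak.
axis = np.linspace(0.0, 1.0, 200)
peak = np.exp(-((axis - 0.5) ** 2) / 0.002)
X = rng.normal(0.0, 0.1, size=(40, 200))
y = np.array([0] * 20 + [1] * 20)
X[y == 1] += peak

# Unsupervised views: linear projection, nonlinear embedding, Ward linkage.
pca_scores = PCA(n_components=2).fit_transform(X)
tsne_scores = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X)
ward_tree = linkage(X, method="ward")  # one row per merge in the dendrogram

# Supervised classifiers, compared by mean 5-fold cross-validated accuracy.
models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
}
accuracy = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
```

The unsupervised methods ignore `y` entirely, which is why they work with a single class, whereas the three classifiers need the labels and at least two classes to be meaningful.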