This project contains a series of scripts to analyze and visualize a dataset of individuals of both sexes, each labeled either typical or ASD. The dataset contains ~200 features covering the areas, thicknesses, and volumes of different brain regions. Many of these features are correlated with one another. The dataset is also somewhat imbalanced: the female ASD group is small, while the male ASD group makes up about 50% of the data.
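
As a rough illustration of how the imbalance and feature correlation described above could be checked, here is a minimal sketch using pandas. The column names (`sex`, `diagnosis`, `feature_0` ...), the group proportions, and the 0.8 correlation threshold are all assumptions made for illustration. The data is synthetic and does not come from the project's dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for the real table; column names are hypothetical.
df = pd.DataFrame({
    "sex": rng.choice(["male", "female"], size=n, p=[0.7, 0.3]),
    "diagnosis": rng.choice(["typical", "ASD"], size=n),
})

# Five features that share one latent factor, so they are strongly correlated.
shared = rng.normal(size=n)
for i in range(5):
    df[f"feature_{i}"] = shared + 0.3 * rng.normal(size=n)

# Share of each sex/diagnosis group, to make any imbalance visible.
group = df["sex"] + "_" + df["diagnosis"]
group_share = group.value_counts(normalize=True)

# Absolute pairwise correlations between the feature columns.
feature_cols = [c for c in df.columns if c.startswith("feature_")]
corr = df[feature_cols].corr().abs()

# Count distinct feature pairs (upper triangle) above the chosen threshold.
upper_idx = np.triu_indices(len(feature_cols), k=1)
n_high = int((corr.to_numpy()[upper_idx] > 0.8).sum())

print(group_share)
print("highly correlated pairs:", n_high)
```

On the real data, a large count of highly correlated pairs would suggest that a projection step is worthwhile before classification.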
The scripts cover several methods for feature projection, ranging from PCA to manifold embedding to t-SNE. Linear Discriminant Analysis (LDA) seems to be the most effective projection method, at least judging by visual inspection of the resulting plots. The scripts also cover methods for evaluating LDA as a classifier. In that role LDA overfits easily and performs below expectations. For this dataset, LDA seems better suited to projecting the features down to 2 dimensions, while an ensemble of other methods gives better classification accuracy.
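
The two roles of LDA described above can be sketched with scikit-learn: a 2-dimensional projection, and a comparison of training accuracy against cross-validated accuracy to expose overfitting. This is a minimal sketch, not the project's own code. The synthetic data, the four-group labeling, the group proportions, and the helper `make_synthetic` are all assumptions for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score


def make_synthetic(n_samples=200, n_features=200, seed=0):
    """Hypothetical stand-in data: correlated features, imbalanced groups."""
    rng = np.random.default_rng(seed)
    # Four made-up groups; one is small and one is about half the data.
    labels = rng.choice(4, size=n_samples, p=[0.2, 0.5, 0.2, 0.1])
    # A few latent factors mixed into many columns gives correlated features.
    latent = rng.normal(size=(n_samples, 10))
    mixing = rng.normal(size=(10, n_features))
    X = latent @ mixing + 0.5 * rng.normal(size=(n_samples, n_features))
    # Add a weak class-dependent shift to the first four columns.
    X[:, :4] += 0.5 * np.eye(4)[labels]
    return X, labels


X, y = make_synthetic()

# LDA as a projection: with 4 classes, at most 3 components are available.
lda = LinearDiscriminantAnalysis(n_components=2)
X_2d = lda.fit_transform(X, y)

# LDA as a classifier: accuracy on the data it was fitted to...
train_acc = lda.score(X, y)

# ...versus accuracy on held-out folds, stratified to respect the imbalance.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=cv).mean()

print("projected shape:", X_2d.shape)
print("train accuracy:", round(train_acc, 3))
print("cross-validated accuracy:", round(cv_acc, 3))
```

A large gap between the training and cross-validated accuracy is the usual sign of overfitting. It is expected when the number of features is close to the number of samples, as in this sketch. scikit-learn may warn that the variables are collinear, which is consistent with the correlated features.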