Machine Learning / Applied Statistics
- Decision trees: how do we determine whether X belongs to class y_i, for y_i ∈ Y? You can build a decision tree, which walks over the possibilities one test at a time until it reaches a leaf that assigns a class. Usually, these trees are binary. Because the tree was constructed from old data, you can use it to classify new (partial) datapoints: follow the branches that match the new datapoint and read off the class at the leaf. link: visualize decision trees
- Random forests build upon decision trees by combining many of them, which should be as uncorrelated as possible. The idea is that multiple (weak) classifiers combined, for example by majority vote, might produce one (better) classifier. link1, link2
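- Isolation forests: how distant is a point from other points? Or: how many random cuts does it take to isolate a point from the rest of the distribution? Large distances, or equivalently a small number of cuts, should indicate outliers: a point far from the bulk of the data is easy to separate, while a point inside a dense region needs many cuts.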
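
The random-cut idea behind isolation forests can be tried out with a small sketch. This is not a full isolation forest (no subsampling, no tree ensemble, no anomaly score); it only counts how many random axis-aligned cuts are needed before a chosen point is alone in its region. The function names, the toy data, and the parameters (`max_cuts`, `n_trials`, the seeds) are assumptions made for illustration.

```python
import random


def cuts_to_isolate(points, target, rng, max_cuts=50):
    """Count random axis-aligned cuts until `target` is alone in its region."""
    region = list(points)
    cuts = 0
    while len(region) > 1 and cuts < max_cuts:
        # Only cut along dimensions where the remaining points still differ.
        dims = [
            d for d in range(len(target))
            if min(p[d] for p in region) < max(p[d] for p in region)
        ]
        if not dims:
            break  # only exact duplicates of the target remain
        dim = rng.choice(dims)
        lo = min(p[dim] for p in region)
        hi = max(p[dim] for p in region)
        split = rng.uniform(lo, hi)
        # Keep only the side of the cut that contains the target.
        if target[dim] < split:
            region = [p for p in region if p[dim] < split]
        else:
            region = [p for p in region if p[dim] >= split]
        cuts += 1
    return cuts


def average_cuts(points, target, n_trials=200, seed=0):
    """Average the cut count over many random trials to reduce noise."""
    rng = random.Random(seed)
    total = sum(cuts_to_isolate(points, target, rng) for _ in range(n_trials))
    return total / n_trials


# Toy data: a dense 2D cluster around the origin plus one far-away point.
data_rng = random.Random(42)
cluster = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(100)]
outlier = (10.0, 10.0)
points = cluster + [outlier]

inlier_cuts = average_cuts(points, cluster[0])
outlier_cuts = average_cuts(points, outlier)
print(f"average cuts, inlier:  {inlier_cuts:.2f}")
print(f"average cuts, outlier: {outlier_cuts:.2f}")
```

On data like this, the far-away point should on average be isolated in noticeably fewer cuts than a point inside the cluster, which is the signal an isolation forest turns into an outlier score.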
Tutorials / writeups