Common algorithms for decision trees, plus a few linear algebra routines:
- Regression tree (MSE criterion)
- Classification tree (Gini or entropy criterion)
- Random forests
- Visualisation of results
- GBDT regression
- GBDT classification
- PCA
- SVD
- Gauss-Jordan
- PowerIteration
- InversePowerIteration
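To give a feel for the split criteria named in the list above (MSE, Gini, entropy), here is a minimal sketch. It is not taken from this project's source: the function names `mse`, `class_proportions`, `gini`, `entropy`, and `split_impurity` are hypothetical, and inputs are assumed to be non-empty vectors.

```julia
# Hypothetical helper names for illustration; not the Strom API.
# All functions assume non-empty input vectors.

# Mean squared error of regression targets around their mean.
function mse(y::AbstractVector{<:Real})
    μ = sum(y) / length(y)
    return sum((v - μ)^2 for v in y) / length(y)
end

# Fraction of samples belonging to each class present in `labels`.
function class_proportions(labels::AbstractVector)
    counts = Dict{eltype(labels),Int}()
    for l in labels
        counts[l] = get(counts, l, 0) + 1
    end
    return [c / length(labels) for c in values(counts)]
end

# Gini impurity: 1 - Σ p².
gini(labels) = 1 - sum(p^2 for p in class_proportions(labels))

# Shannon entropy in bits: -Σ p·log2(p). Only classes that occur are counted,
# so p is never zero.
entropy(labels) = -sum(p * log2(p) for p in class_proportions(labels))

# Size-weighted impurity of a candidate split into `left` and `right`.
# A tree would typically pick the split that minimises this value.
function split_impurity(criterion, left, right)
    n = length(left) + length(right)
    return (length(left) * criterion(left) + length(right) * criterion(right)) / n
end
```

For example, `split_impurity(gini, left_labels, right_labels)` scores a classification split, and `split_impurity(mse, left_targets, right_targets)` scores a regression split. The actual heuristics used in this project may differ.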
This project was written for practice and out of curiosity, and I wouldn't recommend using it in real applications. It does not focus on speed or on idiomatic Julia constructs (this is literally the first code I've ever written in Julia).
Nonetheless, you can use it to learn about implementations of the listed algorithms. While working on it, I sometimes looked for other implementations on the Internet to help me debug my own code. Interestingly, for many of the algorithms here, a good reference is hard to find. Usually, the implementations you find are too complex and super-duper-optimized, with the main idea buried, or you find nothing at all.
If you are looking for top-notch implementations of the above, I would suggest looking at DecisionTree.jl, LinearAlgebra, and XGBoost.
The boosted trees were written with the following paper in mind and based on what was summarized in the following lecture. However, they don't use some optimization tricks and the heuristic for picking the best split might differ.
To run this, you will need Julia in your PATH (download it here).
To install all dependencies, go to the project repository, open the REPL (just type julia), and enter the pkg interface by typing ].
In the pkg interface, invoke the following commands:
activate .
instantiate
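If you prefer not to use the interactive pkg interface, the same two steps can be run through Julia's standard `Pkg` library. This is a minimal sketch: `setup_project` is a hypothetical helper name, and it assumes the given path is the root of the repository (the directory containing the project file).

```julia
using Pkg

# Hypothetical helper: activate the project at `repo_path` and install
# the dependencies it declares. Equivalent to running `activate .` and
# `instantiate` in the pkg interface from inside the repository.
function setup_project(repo_path::AbstractString)
    Pkg.activate(repo_path)   # use the project's own environment
    Pkg.instantiate()         # download and install its dependencies
end
```

Calling `setup_project(".")` from the repository root should have the same effect as the two pkg commands above.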
From now on, the project is set up as a package (similar to how it works in Python). You can start any example with:
julia --project=PATH_TO_REPO PATH_TO_REPO/examples/EXAMPLE
or directly from the REPL (if the project is already activated in pkg):
include("PATH_TO_REPO/examples/EXAMPLE")
To load the project as a module, execute the following:
include("PATH_TO_REPO/src/Strom.jl")
using .Strom
With this, you are ready to go. To see what you can do, it's best to look at the examples directory.
The project also provides plotting functionality. You can see some results below.