DronovIlya/random-forest-java

Random forest, Java implementation

Random Forest is a bagging-based machine learning algorithm that combines multiple decision trees. The final prediction is aggregated across all trees in the forest.
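
As an illustration of the aggregation step, here is a minimal sketch that assumes classification with integer class labels and majority voting. The class and method names (`MajorityVote`, `aggregate`) are hypothetical and are not taken from this repository, and the tie-breaking rule is an arbitrary choice made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the repository: combines per-tree predictions.
final class MajorityVote {

    // Returns the label predicted by the most trees.
    // Assumption: ties are broken in favour of the smaller label.
    static int aggregate(int[] treePredictions) {
        if (treePredictions.length == 0) {
            throw new IllegalArgumentException("need at least one prediction");
        }
        Map<Integer, Integer> counts = new HashMap<>();
        for (int label : treePredictions) {
            counts.merge(label, 1, Integer::sum);
        }
        int best = 0;
        int bestCount = -1;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            int label = e.getKey();
            int count = e.getValue();
            if (count > bestCount || (count == bestCount && label < best)) {
                best = label;
                bestCount = count;
            }
        }
        return best;
    }
}
```

For example, `MajorityVote.aggregate(new int[] {1, 0, 1})` returns `1`, because two of the three trees voted for class 1.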

The process of building Random Forest in this implementation:

  • Generate a bootstrap sample (drawn with replacement) from the training data
  • Build a tree on the bootstrap sample by recursively repeating the following steps:
    • Randomly select a subset of variables from the full feature set
    • Using information gain, pick the best split point among the selected features
    • Split the node into left and right children
    • Repeat until the minimum node size min_samples_leaf is reached
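
The steps above can be sketched roughly as follows. This is not the repository's code: the class `ForestSteps` and all method names are hypothetical, labels are assumed to be binary (0 or 1), information gain is assumed to be entropy-based, and candidate thresholds are simply the observed feature values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the per-tree building blocks; names are illustrative only.
final class ForestSteps {

    // Bootstrap sample: draw n row indices uniformly with replacement.
    static int[] bootstrapIndices(int n, Random rng) {
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) idx[i] = rng.nextInt(n);
        return idx;
    }

    // Pick k distinct feature indices out of totalFeatures.
    static int[] randomFeatures(int totalFeatures, int k, Random rng) {
        List<Integer> all = new ArrayList<>();
        for (int f = 0; f < totalFeatures; f++) all.add(f);
        Collections.shuffle(all, rng);
        int[] chosen = new int[k];
        for (int i = 0; i < k; i++) chosen[i] = all.get(i);
        return chosen;
    }

    // Shannon entropy in bits for binary labels, given the count of positives.
    static double entropy(int positives, int total) {
        if (total == 0 || positives == 0 || positives == total) return 0.0;
        double p = (double) positives / total;
        return -(p * log2(p) + (1 - p) * log2(1 - p));
    }

    static double log2(double v) {
        return Math.log(v) / Math.log(2);
    }

    // Information gain of the split x[row][feature] <= threshold over the given rows.
    static double informationGain(double[][] x, int[] y, int[] rows, int feature, double threshold) {
        int leftN = 0, leftPos = 0, pos = 0;
        for (int r : rows) {
            pos += y[r];
            if (x[r][feature] <= threshold) {
                leftN++;
                leftPos += y[r];
            }
        }
        int n = rows.length;
        int rightN = n - leftN;
        double children = (leftN * entropy(leftPos, leftN)
                + rightN * entropy(pos - leftPos, rightN)) / n;
        return entropy(pos, n) - children;
    }

    // Best split among the candidate features, as {feature, threshold, gain}.
    // Returns null when no split leaves at least minSamplesLeaf rows on each side,
    // which is where the recursion would stop and the node would become a leaf.
    static double[] bestSplit(double[][] x, int[] y, int[] rows, int[] features, int minSamplesLeaf) {
        double[] best = null;
        for (int f : features) {
            for (int r : rows) {
                double t = x[r][f];
                int left = 0;
                for (int q : rows) if (x[q][f] <= t) left++;
                if (left < minSamplesLeaf || rows.length - left < minSamplesLeaf) continue;
                double gain = informationGain(x, y, rows, f, t);
                if (best == null || gain > best[2]) best = new double[] {f, t, gain};
            }
        }
        return best;
    }
}
```

A tree builder would call `bestSplit` on the current node's rows, partition them by the returned threshold, and recurse on each side until `bestSplit` returns `null`. The threshold search here is quadratic in the node size for clarity; sorting the feature values first would be the usual optimization.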

Tunable parameters

  • n_estimators - The number of trees in the forest
  • min_samples_leaf - The minimum number of samples required at a leaf node
  • max_features - The number of features to consider when looking for the best split
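
To make the three parameters concrete, here is a small sketch of a parameter holder. The class `ForestParams` and its members are hypothetical and do not reflect this repository's actual API. The square-root rule for `max_features` is a common heuristic for classification, included here as an assumption rather than as this implementation's default.

```java
// Hypothetical parameter holder; field names mirror the parameters listed above.
final class ForestParams {
    final int nEstimators;     // number of trees in the forest
    final int minSamplesLeaf;  // minimum number of samples at a leaf node
    final int maxFeatures;     // features considered at each split

    ForestParams(int nEstimators, int minSamplesLeaf, int maxFeatures) {
        if (nEstimators < 1 || minSamplesLeaf < 1 || maxFeatures < 1) {
            throw new IllegalArgumentException("all parameters must be >= 1");
        }
        this.nEstimators = nEstimators;
        this.minSamplesLeaf = minSamplesLeaf;
        this.maxFeatures = maxFeatures;
    }

    // Assumed heuristic: roughly sqrt(totalFeatures) features per split, at least 1.
    static int sqrtFeatures(int totalFeatures) {
        return Math.max(1, (int) Math.round(Math.sqrt(totalFeatures)));
    }
}
```

With 16 input features, `new ForestParams(100, 1, ForestParams.sqrtFeatures(16))` would configure 100 trees that each consider 4 features per split.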

Comparison

A comparison between this Java Random Forest and sklearn's Random Forest on the Spine dataset is available in notebook/spine-RandomForest.ipynb.