Package org.apache.mahout.classifier.df.mapreduce.partial

Partial-data mapreduce implementation of Random Decision Forests

The builder splits the data among the mappers using a FileInputSplit. Building the forest and estimating the out-of-bag (oob) error takes two job steps.

In the first step, each mapper grows a number of trees from its partition's data: it loads the data instances in its map() function, then builds the trees in the close() method. It uses the reference implementation's code to build each tree and estimate the oob error.
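
The buffer-then-build pattern of the first step can be illustrated with a minimal plain-Java sketch. It does not use the Hadoop or Mahout APIs. The class PartitionBuilderSketch is hypothetical, the "tree" is only a placeholder majority-label model, and bootstrap sampling of the partition is an assumption about how each tree's training rows are drawn.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for a step-one mapper; not Mahout's actual class. */
class PartitionBuilderSketch {
    // Labels buffered from this mapper's partition (assumed to be non-negative ints).
    private final List<Integer> labels = new ArrayList<>();
    private final int numTrees;
    private final Random rng;

    PartitionBuilderSketch(int numTrees, long seed) {
        this.numTrees = numTrees;
        this.rng = new Random(seed);
    }

    /** Called once per record: only buffers the instance, no training yet. */
    void map(int label) {
        labels.add(label);
    }

    /** Called after the last record: builds numTrees placeholder models. */
    List<Integer> close() {
        List<Integer> models = new ArrayList<>();
        if (labels.isEmpty()) {
            return models; // nothing was loaded, so nothing can be built
        }
        int maxLabel = 0;
        for (int l : labels) {
            maxLabel = Math.max(maxLabel, l);
        }
        for (int t = 0; t < numTrees; t++) {
            int[] counts = new int[maxLabel + 1];
            // Assumed bootstrap sample: draw partition-size rows with replacement.
            for (int i = 0; i < labels.size(); i++) {
                counts[labels.get(rng.nextInt(labels.size()))]++;
            }
            // Placeholder "tree": the majority label of the sample.
            int best = 0;
            for (int l = 1; l < counts.length; l++) {
                if (counts[l] > counts[best]) {
                    best = l;
                }
            }
            models.add(best);
        }
        return models;
    }
}
```

Deferring the work to close() matters because a tree cannot be grown from a single record: the whole partition has to be in memory before training starts.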

The second step is needed when estimating the oob error. Each mapper loads all the trees that do not belong to its own partition (those not built using the partition's data) and uses them to classify the partition's data instances. The data instances are loaded in the map() method and the classification is performed in the close() method.
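
The second step can be sketched in the same way, again in plain Java with no Hadoop or Mahout dependency. OobStepSketch, TaggedTree and oobError are hypothetical names, and combining the foreign trees by majority vote is an assumption rather than a description of the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical sketch of the second step; not Mahout's actual class. */
class OobStepSketch {
    /** A tree together with the id of the partition whose data built it. */
    static final class TaggedTree {
        final int partition;
        final ToIntFunction<double[]> classifier;

        TaggedTree(int partition, ToIntFunction<double[]> classifier) {
            this.partition = partition;
            this.classifier = classifier;
        }
    }

    /**
     * Classifies one partition's instances with the trees built elsewhere.
     * Assumes every classifier returns a label in [0, numLabels).
     * Returns NaN when there is no foreign tree or no instance.
     */
    static double oobError(int ownPartition, List<TaggedTree> forest,
                           List<double[]> features, int[] labels, int numLabels) {
        // Keep only the trees that were not built from this partition's data.
        List<TaggedTree> foreign = new ArrayList<>();
        for (TaggedTree tree : forest) {
            if (tree.partition != ownPartition) {
                foreign.add(tree);
            }
        }
        if (foreign.isEmpty() || features.isEmpty()) {
            return Double.NaN;
        }
        int errors = 0;
        for (int i = 0; i < features.size(); i++) {
            int[] votes = new int[numLabels];
            for (TaggedTree tree : foreign) {
                votes[tree.classifier.applyAsInt(features.get(i))]++;
            }
            // Assumed combination rule: majority vote, ties go to the lowest label.
            int predicted = 0;
            for (int l = 1; l < numLabels; l++) {
                if (votes[l] > votes[predicted]) {
                    predicted = l;
                }
            }
            if (predicted != labels[i]) {
                errors++;
            }
        }
        return (double) errors / features.size();
    }
}
```

Because a tree never votes on the instances it was trained on, the resulting error rate is measured on data each tree has not seen.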

Copyright © 2008–2017 The Apache Software Foundation. All rights reserved.