Interface | Description |
---|---|
SplitInput.SplitCallback | Used to pass information back to a caller once a file has been split, without the need for a data object. |
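
The callback idea described above can be sketched in plain Java. The `SplitListener` interface, its `splitComplete` parameters, and the `CallbackSketch` class are hypothetical names made up for this illustration; they are not the actual signature of `SplitInput.SplitCallback`. The sketch only shows how split counts can be reported to a caller without returning a data object.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical callback for illustration only; not Mahout's SplitInput.SplitCallback. */
interface SplitListener {
    void splitComplete(String inputName, int lineCount, int trainCount, int testCount);
}

final class CallbackSketch {

    /**
     * Pretends to split the given lines so that testCount of them form the test set,
     * then reports the resulting counts through the listener instead of a return value.
     */
    static void split(String inputName, List<String> lines, int testCount, SplitListener listener) {
        if (testCount < 0 || testCount > lines.size()) {
            throw new IllegalArgumentException("testCount out of range: " + testCount);
        }
        int trainCount = lines.size() - testCount;
        // A real utility would write the training and test files here.
        listener.splitComplete(inputName, lines.size(), trainCount, testCount);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c", "d", "e");
        split("example.txt", lines, 2, (name, total, train, test) ->
                System.out.println(name + ": " + total + " lines, "
                        + train + " train, " + test + " test"));
    }
}
```

Because the listener receives plain arguments, the caller can log or aggregate the counts without the splitter having to define a result class.
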
Class | Description |
---|---|
Bump125 | Helps with making nice intervals at arbitrary scale. |
MatrixDumper | Exports a Matrix in various text formats (currently: CSV file). Input format: Hadoop SequenceFile with a Text key and a MatrixWritable value, 1 pair. TODO: the key class needs to be configurable rather than hard-coded to Text. |
SequenceFileDumper | |
SplitInput | A utility for splitting files into training and test sets in order to perform cross-validation. It accepts the input format used by the Bayes classifiers, any other format that has one item per line, or SequenceFiles (key/value). |
SplitInputJob | Implements a MapReduce version of SplitInput. |
SplitInputJob.SplitInputComparator | Randomly permutes key/value pairs. |
SplitInputJob.SplitInputMapper | Mapper which downsamples the input by downsamplingFactor. |
SplitInputJob.SplitInputReducer | Reducer which uses MultipleOutputs to randomly allocate key/value pairs between test and training outputs. |
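
As a rough illustration of the train/test splitting these classes describe, the following self-contained sketch randomly permutes a list of items and allocates a percentage of them to a test set. It runs in memory on plain Java collections and does not use Hadoop or any Mahout API. The class name `TrainTestSplitSketch`, the `testPercent` parameter, and the fixed seed are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical in-memory sketch; not the SplitInput or SplitInputJob implementation. */
final class TrainTestSplitSketch {
    final List<String> training = new ArrayList<>();
    final List<String> test = new ArrayList<>();

    /** Shuffles a copy of the items, then puts testPercent of them in the test set. */
    static TrainTestSplitSketch split(List<String> items, int testPercent, long seed) {
        if (testPercent < 0 || testPercent > 100) {
            throw new IllegalArgumentException("testPercent must be in [0, 100]: " + testPercent);
        }
        List<String> shuffled = new ArrayList<>(items);
        // Random permutation, so membership in the test set does not depend on input order.
        Collections.shuffle(shuffled, new Random(seed));
        int testSize = items.size() * testPercent / 100; // integer division rounds down

        TrainTestSplitSketch result = new TrainTestSplitSketch();
        result.test.addAll(shuffled.subList(0, testSize));
        result.training.addAll(shuffled.subList(testSize, shuffled.size()));
        return result;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList(
                "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9");
        TrainTestSplitSketch s = split(items, 20, 42L);
        System.out.println("training=" + s.training.size() + " test=" + s.test.size());
    }
}
```

Every item lands in exactly one of the two sets, which is the property cross-validation relies on. A distributed version would have to achieve the same effect without holding all items in memory.
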
Copyright © 2008–2017 The Apache Software Foundation. All rights reserved.