public class RowIdJob extends AbstractJob

Converts a vector representation of documents into a document x terms matrix.
The job reads input in SequenceFile<Text,VectorWritable> format (as generated by SparseVectorsFromSequenceFiles or by EncodedVectorsFromSequenceFiles) and writes the following two files as output:

- A file called "matrix" of format SequenceFile<IntWritable,VectorWritable>.
- A file called "docIndex" of format SequenceFile<IntWritable,Text>.

The RowIdJob replaces the document text ids by integers.
The original document text ids can still be retrieved from the "docIndex".

Fields inherited from class AbstractJob:
argMap, inputFile, inputPath, outputFile, outputPath, tempPath
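
The id replacement described above can be illustrated with a small plain-Java sketch. This is not Mahout's implementation and uses no Hadoop or Mahout classes: the class `RowIdSketch`, the `Result` holder, and the method `assignRowIds` are hypothetical names, in-memory maps stand in for the two sequence files, and numbering rows from 0 in iteration order is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java illustration of the row-id idea; hypothetical, not Mahout code. */
final class RowIdSketch {

    /** Holds the two outputs: integer-keyed rows and the id lookup table. */
    static final class Result {
        // Stands in for the SequenceFile<IntWritable,VectorWritable> output.
        final Map<Integer, double[]> matrix = new LinkedHashMap<>();
        // Stands in for the SequenceFile<IntWritable,Text> output.
        final Map<Integer, String> docIndex = new LinkedHashMap<>();
    }

    /**
     * Replaces each text document id with a consecutive integer.
     * Numbering starts at 0 and follows iteration order (sketch assumption).
     */
    static Result assignRowIds(Map<String, double[]> vectorsByDocId) {
        Result result = new Result();
        int rowId = 0;
        for (Map.Entry<String, double[]> entry : vectorsByDocId.entrySet()) {
            result.matrix.put(rowId, entry.getValue());  // row keyed by integer
            result.docIndex.put(rowId, entry.getKey());  // integer -> original text id
            rowId++;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, double[]> input = new LinkedHashMap<>();
        input.put("/docs/a.txt", new double[] {1.0, 0.0, 2.0});
        input.put("/docs/b.txt", new double[] {0.0, 3.0, 0.0});

        Result result = assignRowIds(input);
        // Recover the original text id of row 1 through the index.
        System.out.println(result.docIndex.get(1));
    }
}
```

Keeping the lookup table separate from the rows mirrors the two-file output: downstream matrix code only sees integer row keys, and the text ids are consulted only when results need to be mapped back to documents.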
Constructor and Description |
---|
RowIdJob() |
Modifier and Type | Method and Description |
---|---|
static void | main(String[] args) |
int | run(String[] args) |
Methods inherited from class AbstractJob:
addFlag, addInputOption, addOption, addOption, addOption, addOption, addOutputOption, buildOption, buildOption, getAnalyzerClassFromOption, getCLIOption, getConf, getDimensions, getFloat, getFloat, getGroup, getInputFile, getInputPath, getInt, getInt, getOption, getOption, getOption, getOptions, getOutputFile, getOutputPath, getOutputPath, getTempPath, getTempPath, hasOption, keyFor, maybePut, parseArguments, parseArguments, parseDirectories, prepareJob, prepareJob, prepareJob, prepareJob, setConf, setS3SafeCombinedInputPath, shouldRunNextPhase
Copyright © 2008–2017 The Apache Software Foundation. All rights reserved.