public class SequenceFilesFromDirectory extends AbstractJob
Creates SequenceFiles of docid => content from the text documents under a parent directory. The docid is set as the relative path of the
document from the parent directory, prepended with a specified prefix. You can also specify the input encoding
of the text files. The content of the output SequenceFiles is encoded as UTF-8 text.

Modifier and Type | Field and Description |
---|---|
static String | BASE_INPUT_PATH |
static String[] | FILE_FILTER_CLASS_OPTION |
static String[] | KEY_PREFIX_OPTION |
Fields inherited from class AbstractJob: argMap, inputFile, inputPath, outputFile, outputPath, tempPath
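
The key and value conventions described for this class (docid as prefix plus relative path, content stored as UTF-8) can be illustrated with a small standalone sketch. This is not the Mahout implementation: the class `DocIdSketch`, its methods `docId` and `toUtf8`, the `/` separator between prefix and path, and the sample paths are all assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class; it only mirrors the conventions described above.
final class DocIdSketch {

    // Builds a docid as: prefix + "/" + path of the file relative to the parent directory.
    // The "/" joiner is an assumption; the real job may join the parts differently.
    static String docId(String prefix, Path parentDir, Path file) {
        Path relative = parentDir.relativize(file);
        // Normalize separators so keys look the same on every platform.
        String rel = relative.toString().replace('\\', '/');
        return prefix + "/" + rel;
    }

    // Decodes raw file bytes with the declared input charset and re-encodes them as UTF-8.
    static byte[] toUtf8(byte[] raw, Charset inputCharset) {
        return new String(raw, inputCharset).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Path parent = Paths.get("/data/corpus");
        Path doc = Paths.get("/data/corpus/sports/article1.txt");
        System.out.println(docId("news", parent, doc));

        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(toUtf8(latin1, StandardCharsets.ISO_8859_1).length + " bytes as UTF-8");
    }
}
```

The relative path keeps keys stable when the same corpus is mounted at a different location, and the prefix lets several input directories share one output without key collisions.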
Constructor and Description |
---|
SequenceFilesFromDirectory() |
Modifier and Type | Method and Description |
---|---|
protected void | addOptions() Override this method to add options to the command line of the SequenceFilesFromDirectory job. |
static void | main(String[] args) |
protected Map<String,String> | parseOptions() Override this method to parse your additional options from the command line. |
int | run(String[] args) |
Methods inherited from class AbstractJob: addFlag, addInputOption, addOption, addOption, addOption, addOption, addOutputOption, buildOption, buildOption, getAnalyzerClassFromOption, getCLIOption, getConf, getDimensions, getFloat, getFloat, getGroup, getInputFile, getInputPath, getInt, getInt, getOption, getOption, getOption, getOptions, getOutputFile, getOutputPath, getOutputPath, getTempPath, getTempPath, hasOption, keyFor, maybePut, parseArguments, parseArguments, parseDirectories, prepareJob, prepareJob, prepareJob, prepareJob, setConf, setS3SafeCombinedInputPath, shouldRunNextPhase
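
The override pattern that addOptions() and parseOptions() invite can be sketched without the Mahout dependency. The classes below are hypothetical stand-ins, not the Mahout API: `BaseJobSketch` plays the role of the job, `LanguageAwareJobSketch` is an illustrative subclass, and the option names `prefix` and `language` are invented for the example. The point is only the shape: call the parent implementation first, then register or read the extra option.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the job class; it only imitates the two hooks.
class BaseJobSketch {
    // Option name -> default value, filled in by addOptions().
    protected final Map<String, String> declared = new LinkedHashMap<>();
    // Option name -> value supplied on the command line.
    protected final Map<String, String> supplied = new LinkedHashMap<>();

    // Hook 1: declare the options the job understands.
    protected void addOptions() {
        declared.put("prefix", "");
    }

    // Hook 2: collect the option values the job will use.
    protected Map<String, String> parseOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("prefix", valueOf("prefix"));
        return options;
    }

    // Returns the supplied value, or the declared default when none was given.
    protected String valueOf(String name) {
        return supplied.getOrDefault(name, declared.get(name));
    }

    // Simplified driver: reads "--name value" pairs, then calls both hooks.
    Map<String, String> run(String... args) {
        addOptions();
        for (int i = 0; i + 1 < args.length; i += 2) {
            supplied.put(args[i].replaceFirst("^--", ""), args[i + 1]);
        }
        return parseOptions();
    }
}

// Subclass that adds one extra option by overriding both hooks.
class LanguageAwareJobSketch extends BaseJobSketch {
    @Override
    protected void addOptions() {
        super.addOptions();            // keep the parent's options
        declared.put("language", "en"); // hypothetical extra option with a default
    }

    @Override
    protected Map<String, String> parseOptions() {
        Map<String, String> options = super.parseOptions();
        options.put("language", valueOf("language"));
        return options;
    }
}
```

With this sketch, `new LanguageAwareJobSketch().run("--prefix", "news")` yields a map holding the supplied prefix and the default language. In a real subclass the registration and lookup would presumably go through the inherited addOption and getOption methods instead.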
public static final String[] FILE_FILTER_CLASS_OPTION
public static final String[] KEY_PREFIX_OPTION
public static final String BASE_INPUT_PATH
Copyright © 2008–2017 The Apache Software Foundation. All rights reserved.