Entry point; does not use the Scala App trait.
Command line args; if empty, a help message is printed.
Creates a Spark context to run the job inside. Override to set SparkConf values specific to the job; these must be set before the context is created.
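A minimal sketch of such an override. The class and method names here (JobDriver, createContext) are illustrative assumptions, not the driver's actual API; only the rule that SparkConf values must be set before the SparkContext is constructed comes from the text above.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver subclass: JobDriver and createContext are assumed
// names for illustration only.
class TunedDriver extends JobDriver {
  override def createContext(masterUrl: String): SparkContext = {
    val conf = new SparkConf()
      .setAppName("spark-rowsimilarity")
      .setMaster(masterUrl)
      // Job-specific setting; must be applied before the context exists.
      .set("spark.kryoserializer.buffer.max", "512m")
    new SparkContext(conf)
  }
}
```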
Call this before start to use an existing context, as when running multiple drivers from a ScalaTest suite.
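A sketch of the shared-context pattern in a test suite. The method name useContext and the driver object name are assumptions made for illustration; consult the driver's source for the real call.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// One context created once for the whole suite.
val sc = new SparkContext(
  new SparkConf().setMaster("local[2]").setAppName("driver-suite"))

// Hypothetical usage: each driver reuses the suite's context instead of
// creating its own. Call before start.
// RowSimilarityDriver.useContext(sc)
// RowSimilarityDriver.main(args)
```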
An already set up context to run against
Command line interface for the row-similarity job. Reads a text-delimited file containing rows of an org.apache.mahout.math.indexeddataset.IndexedDataset with domain-specific IDs of the form (row ID, column ID: strength, ...). The IDs will be preserved in the output. The rows define a matrix, and row-wise similarity will be calculated using the log-likelihood ratio (LLR). The options allow control of the input schema, file discovery, the output schema, and algorithm parameters.
To get help run "mahout spark-rowsimilarity" for a full explanation of options. The default values for formatting will read (rowID<tab>columnID1:strength1<space>columnID2:strength2 ...) and write (rowID<tab>rowID1:strength1<space>rowID2:strength2 ...). Each output line will contain a row ID and similar rows sorted by LLR strength, descending.
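With the default schema, input and output lines might look like the following. The IDs and strength values are made up for illustration; only the layout (tab after the row ID, space-separated ID:strength pairs) comes from the description above.

```text
# input: rowID<tab>columnID1:strength1<space>columnID2:strength2 ...
u1	iphone:1.0 ipad:1.0
u2	nexus:1.0 galaxy:1.0

# output: rowID<tab>rowID1:strength1<space>rowID2:strength2 ...
# (similar rows, sorted by LLR strength descending)
u1	u3:1.73 u2:0.68
```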
To use with a Spark cluster see the --master option; if you run out of heap space, check the --sparkExecutorMemory option.
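A hedged example invocation. Only --master and --sparkExecutorMemory appear in the text above; the --input and --output flag names, paths, and values are assumptions, so run "mahout spark-rowsimilarity" with no arguments for the authoritative option list.

```shell
# Illustrative only: flag names other than --master and
# --sparkExecutorMemory, and all paths/values, are assumptions.
mahout spark-rowsimilarity \
  --input hdfs://namenode:8020/data/rows.tsv \
  --output hdfs://namenode:8020/data/similarity \
  --master spark://master:7077 \
  --sparkExecutorMemory 8g
```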