DistributedEngine

Abstract Value Members

abstract def allreduceBlock[K](drm: CheckpointedDrm[K], bmf: ((Array[K], Matrix)) ⇒ Matrix, rf: (Matrix, Matrix) ⇒ Matrix): Matrix

Optional engine-specific all-reduce tensor operation.
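As a rough illustration of the all-reduce pattern this signature suggests, the sketch below applies a block map function to each (keys, block) pair and folds the partial results with a reduce function. It is plain Scala, not Mahout code: the `Matrix` and `Block` aliases and the helpers `blockColSums` and `plus` are assumptions made for the example, and a real engine would run the map step over distributed partitions.

```scala
object AllReduceSketch {
  // Stand-ins (assumptions): an in-core matrix is an array of rows, and a
  // "distributed" matrix is a sequence of (row keys, block) pairs.
  type Matrix = Array[Array[Double]]
  type Block[K] = (Array[K], Matrix)

  // Map every block with bmf, then combine the partial results with rf.
  // Assumes at least one block; reduce throws on an empty sequence.
  def allreduceBlock[K](blocks: Seq[Block[K]],
                        bmf: Block[K] => Matrix,
                        rf: (Matrix, Matrix) => Matrix): Matrix =
    blocks.map(bmf).reduce(rf)

  // Example block map: a 1 x ncol matrix holding the block's column sums.
  def blockColSums[K](block: Block[K]): Matrix =
    Array(block._2.transpose.map(_.sum))

  // Example reduce: element-wise sum of two equally shaped matrices.
  def plus(a: Matrix, b: Matrix): Matrix =
    a.zip(b).map { case (ra, rb) => ra.zip(rb).map { case (x, y) => x + y } }
}
```

With two blocks holding the rows (1, 2), (3, 4) and (5, 6), mapping with `blockColSums` and reducing with `plus` yields the single row (9, 12).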
abstract def colMeans[K](drm: CheckpointedDrm[K]): Vector

Engine-specific colMeans implementation based on a checkpoint.
abstract def colSums[K](drm: CheckpointedDrm[K]): Vector

Engine-specific colSums implementation based on a checkpoint.
abstract def drm2IntKeyed[K](drmX: DrmLike[K], computeMap: Boolean = false): (DrmLike[Int], Option[DrmLike[K]])

Convert a non-int-keyed matrix to an int-keyed one, optionally computing the mapping from old keys to row indices in the new matrix. The mapping, if requested, is returned as a 1-column matrix.
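The re-keying idea can be sketched in plain Scala as follows. This is only an illustration under stated assumptions: rows are modelled as a local sequence of (key, row) pairs, the name `toIntKeyed` is hypothetical, and the mapping is returned as (old key, new index) pairs rather than as a 1-column distributed matrix.

```scala
object IntKeySketch {
  type Row = Array[Double]

  // Replace each key with the row's ordinal position. If computeMap is set,
  // also return the mapping from each old key to its new row index.
  def toIntKeyed[K](rows: Seq[(K, Row)], computeMap: Boolean = false): (Seq[(Int, Row)], Option[Seq[(K, Int)]]) = {
    val indexed = rows.zipWithIndex
    val intKeyed = indexed.map { case ((_, row), i) => (i, row) }
    val mapping =
      if (computeMap) Some(indexed.map { case ((key, _), i) => (key, i) })
      else None
    (intKeyed, mapping)
  }
}
```

When `computeMap` is left at its default of `false`, the second element of the result is `None`, mirroring the `Option` in the signature above.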
abstract def drmBroadcast(m: Matrix)(implicit dc: DistributedContext): BCast[Matrix]

Broadcast support
abstract def drmBroadcast(v: Vector)(implicit dc: DistributedContext): BCast[Vector]

Broadcast support
abstract def drmDfsRead(path: String, parMin: Int = 0)(implicit sc: DistributedContext): CheckpointedDrm[_]

Load DRM from HDFS (as in Mahout DRM format).
path
The DFS path to load from
parMin
Minimum parallelism after load (equivalent to #par(min=...)).
abstract def drmParallelizeEmpty(nrow: Int, ncol: Int, numPartitions: Int = 10)(implicit sc: DistributedContext): CheckpointedDrm[Int]

This creates an empty DRM with specified number of partitions and cardinality.
abstract def drmParallelizeEmptyLong(nrow: Long, ncol: Int, numPartitions: Int = 10)(implicit sc: DistributedContext): CheckpointedDrm[Long]

Creates empty DRM with non-trivial height
abstract def drmParallelizeWithRowIndices(m: Matrix, numPartitions: Int = 1)(implicit sc: DistributedContext): CheckpointedDrm[Int]

Parallelize in-core matrix as the backend engine distributed matrix, using row ordinal indices as data set keys.
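A minimal sketch of what keying by row ordinal and splitting into partitions could look like, in plain Scala. The contiguous-chunk partitioning and the local `Seq` representation are assumptions for illustration; how a given backend actually distributes the rows is engine-specific.

```scala
object ParallelizeSketch {
  type Row = Array[Double]

  // Key each row by its ordinal index, then split the keyed rows into at
  // most numPartitions contiguous chunks (an assumed partitioning scheme).
  def parallelizeWithRowIndices(m: IndexedSeq[Row], numPartitions: Int = 1): Seq[Seq[(Int, Row)]] = {
    require(numPartitions > 0, "numPartitions must be positive")
    val keyed = m.zipWithIndex.map { case (row, i) => (i, row) }
    val chunkSize = math.max(1, math.ceil(keyed.size.toDouble / numPartitions).toInt)
    keyed.grouped(chunkSize).toSeq
  }
}
```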
abstract def drmParallelizeWithRowLabels(m: Matrix, numPartitions: Int = 1)(implicit sc: DistributedContext): CheckpointedDrm[String]

Parallelize in-core matrix as the backend engine distributed matrix, using row labels as data set keys.
abstract def drmSampleKRows[K](drmX: DrmLike[K], numSamples: Int, replacement: Boolean = false): Matrix
abstract def drmSampleRows[K](drmX: DrmLike[K], fraction: Double, replacement: Boolean = false): DrmLike[K]

(Optional) Sampling operation. Consistent with Spark semantics of the same.
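The two sampling signatures differ in what they fix: a fraction of rows versus an exact number of rows. The plain-Scala sketch below illustrates that difference. It assumes that fraction-based sampling without replacement keeps each row independently with probability `fraction` (Bernoulli sampling); the with-replacement case for fractions is not covered, and the explicit `rng` parameter is an addition for reproducibility, not part of the API above.

```scala
import scala.util.Random

object SamplingSketch {
  // Fraction-based sampling without replacement: each row is kept
  // independently with probability `fraction`, so the result size varies.
  def sampleRows[A](rows: Seq[A], fraction: Double, rng: Random): Seq[A] =
    rows.filter(_ => rng.nextDouble() < fraction)

  // Fixed-size sampling: exactly numSamples rows with replacement, or at
  // most numSamples distinct positions without replacement.
  def sampleKRows[A](rows: IndexedSeq[A], numSamples: Int,
                     replacement: Boolean, rng: Random): IndexedSeq[A] =
    if (replacement) {
      require(rows.nonEmpty, "cannot sample from an empty collection")
      IndexedSeq.fill(numSamples)(rows(rng.nextInt(rows.size)))
    } else {
      rng.shuffle(rows).take(numSamples)
    }
}
```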
abstract def indexedDatasetDFSRead(src: String, schema: Schema = DefaultIndexedDatasetReadSchema, existingRowIDs: Option[BiDictionary] = None)(implicit sc: DistributedContext): IndexedDataset

Load IndexedDataset from text delimited format.
src
comma delimited URIs to read from
schema
defines format of file(s)
abstract def indexedDatasetDFSReadElements(src: String, schema: Schema = ..., existingRowIDs: Option[BiDictionary] = None)(implicit sc: DistributedContext): IndexedDataset

Load IndexedDataset from text delimited format, one element per line
src
comma delimited URIs to read from
schema
defines format of file(s)
abstract def norm[K](drm: CheckpointedDrm[K]): Double
abstract def numNonZeroElementsPerColumn[K](drm: CheckpointedDrm[K]): Vector

Engine-specific numNonZeroElementsPerColumn implementation based on a checkpoint.
abstract def toPhysical[K](plan: DrmLike[K], ch: CacheHint)(implicit arg0: ClassTag[K]): CheckpointedDrm[K]

Second optimizer pass. Translates the previously rewritten logical pipeline into a physical engine plan.

Concrete Value Members

final def !=(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def !=(arg0: Any): Boolean

Definition Classes
Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def ==(arg0: Any): Boolean

Definition Classes
Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def optimizerRewrite[K](action: DrmLike[K])(implicit arg0: ClassTag[K]): DrmLike[K]

First optimization pass. Return physical plan that we can pass to exec(). This rewrite may introduce logical constructs (including engine-specific ones) that the user DSL cannot even produce per se.
A particular physical engine implementation may choose to either use the default rewrites or build its own rewriting rules.
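To make the two-pass idea concrete, here is a toy sketch in plain Scala: a first pass rewrites a logical plan, and a second pass translates the rewritten plan into something executable. The operator classes (`Source`, `Transpose`, `Scale`), the rewrite rules, and the `Physical` thunk type are all hypothetical and far simpler than any real engine's operators; only the method names echo the signatures documented here.

```scala
object TwoPassSketch {
  // Hypothetical logical operators (not Mahout's actual operator classes).
  sealed trait Logical
  final case class Source(data: Vector[Vector[Double]]) extends Logical
  final case class Transpose(child: Logical) extends Logical
  final case class Scale(child: Logical, factor: Double) extends Logical

  // Pass 1: bottom-up rewrite. Cancels double transposes and fuses
  // directly nested scalings into a single scaling.
  def optimizerRewrite(plan: Logical): Logical = plan match {
    case Transpose(child) =>
      optimizerRewrite(child) match {
        case Transpose(inner) => inner
        case other            => Transpose(other)
      }
    case Scale(child, f) =>
      optimizerRewrite(child) match {
        case Scale(inner, g) => Scale(inner, f * g)
        case other           => Scale(other, f)
      }
    case s: Source => s
  }

  // Pass 2: translate the logical plan into a "physical" plan, modelled
  // here as a deferred computation producing an in-core matrix.
  type Physical = () => Vector[Vector[Double]]

  def toPhysical(plan: Logical): Physical = plan match {
    case Source(data) =>
      () => data
    case Transpose(child) =>
      val p = toPhysical(child)
      () => p().transpose
    case Scale(child, f) =>
      val p = toPhysical(child)
      () => p().map(_.map(_ * f))
  }
}
```

For example, a plan that scales by 2, transposes twice, and scales by 3 is rewritten to a single scaling by 6 over the source before any physical work is planned.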
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

trait DistributedEngine extends AnyRef
