Blockified DRM rdd (keys of the original DRM are grouped into arrays corresponding to the rows of the Matrix object value)
Row-wise organized DRM rdd type
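The two RDD shapes above can be sketched as Scala type aliases. The names mirror Mahout's sparkbindings conventions, but treat the exact definitions as an assumption rather than the authoritative source:

```scala
import org.apache.spark.rdd.RDD
import org.apache.mahout.math.{Matrix, Vector}

object DrmTypes {

  // Row-wise DRM rdd: each element pairs a row key with that row's Vector
  // (assumed alias, modeled on org.apache.mahout.sparkbindings.DrmRdd).
  type DrmRdd[K] = RDD[(K, Vector)]

  // Blockified DRM rdd: the keys of the original rows are grouped into an
  // array whose order lines up with the rows of the Matrix block.
  type BlockifiedDrmRdd[K] = RDD[(Array[K], Matrix)]
}
```

Blockification trades many small (key, Vector) tuples for a few (keys, Matrix) tuples, which lets block-wise operators work on dense in-core matrices.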
Spark-specific non-drm-method operations
This package contains distributed algorithms that the distributed matrix expression optimizer picks from.
Adding Spark-specific ops
row key type
source rdd conforming to org.apache.mahout.sparkbindings.DrmRdd
optional, number of rows. If not specified, we'll try to figure it out on our own.
optional, number of columns. If not specified, we'll try to figure it out on our own.
optional, desired cache policy for that rdd.
optional. For int-keyed rows, there might be implied but missing rows.
If the underlying rdd may have that condition, we need to know, since some
operators consider it a deficiency and we'll need to fix it lazily
before proceeding with such operators. This is only meaningful if nrow is
also specified (otherwise, we'll run a quick test to figure out whether
rows may be missing at the time we count the rows).
wrapped DRM
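A minimal usage sketch for the parameters described above. The parameter names follow the descriptions, but default values and the implicit-context plumbing are assumptions about the sparkbindings API, not a definitive signature:

```scala
import org.apache.mahout.math._
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.sparkbindings._

// `sdc` is an implicit SparkDistributedContext assumed to be in scope
// (see mahoutSparkContext below for how one is typically created).

// Build a tiny row-wise rdd of (Int key, Vector) tuples conforming to DrmRdd.
val rows = sdc.parallelize(Seq(
  0 -> dvec(1.0, 2.0),
  1 -> dvec(3.0, 4.0)
))

// nrow and ncol are optional; omitting them defers discovery until needed.
val drmA = drmWrap(rdd = rows, nrow = 2, ncol = 2)
```

Passing nrow explicitly avoids a counting pass over the rdd, at the cost of having to also declare (via the missing-rows flag) whether int-keyed rows may be absent.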
Another drmWrap version that takes in vertical block-partitioned input to form the matrix.
A drmWrap version that takes a DataFrame of Row[Double]
A drmWrap version that takes an RDD[org.apache.spark.mllib.regression.LabeledPoint] and returns a DRM where the label is the last column
A drmWrap version that takes an RDD[org.apache.spark.mllib.linalg.Vector]
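A sketch of wrapping MLlib data as described above. The function names `drmWrapMLLibLabeledPoint` and `drmWrapMLLibVector` are inferred from the descriptions and should be checked against the actual sparkbindings API:

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.mahout.sparkbindings._

// Each LabeledPoint (label, features) becomes one DRM row; per the
// description above, the label lands in the last column of the DRM.
val points = sc.parallelize(Seq(
  LabeledPoint(1.0, Vectors.dense(0.5, 0.7)),
  LabeledPoint(0.0, Vectors.dense(0.1, 0.2))
))
val drmLabeled = drmWrapMLLibLabeledPoint(points)

// Plain mllib Vectors map one-to-one onto DRM rows, no label column added.
val vecs = sc.parallelize(Seq(
  Vectors.dense(1.0, 2.0),
  Vectors.dense(3.0, 4.0)
))
val drmB = drmWrapMLLibVector(vecs)
```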
Create proper spark context that includes local Mahout jars
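A sketch of creating the context; `masterUrl` and `appName` are the required arguments, while any further parameters (custom jars, a pre-built SparkConf) are assumed to be optional:

```scala
import org.apache.mahout.sparkbindings._

// mahoutSparkContext builds a Spark context with the local Mahout jars
// added to the executor classpath, so DRM closures can deserialize
// Mahout math types on the workers.
implicit val sdc = mahoutSparkContext(
  masterUrl = "local[2]",
  appName   = "mahout-spark-example"
)
```

Making the resulting context an `implicit val` lets drmWrap and the other operators pick it up without threading it through every call.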
Broadcast transforms
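A sketch of the broadcast transform, assuming `drmBroadcast` wraps Spark's broadcast mechanism for Mahout in-core types (the `BCast` wrapper name is an assumption):

```scala
import org.apache.mahout.math._
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.sparkbindings._

// Broadcast an in-core Vector so closures running on executors can read
// it without re-serializing it with every task.
val v = dvec(1.0, 2.0, 3.0)
val bcastV = drmBroadcast(v)

// Inside a distributed closure (e.g. a mapBlock body), dereference the
// broadcast value rather than capturing `v` directly.
// val w: Vector = bcastV.value
```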
Public API for Spark-specific operators