org.apache.mahout.sparkbindings
Optional engine-specific all-reduce tensor operation.
Engine-specific colMeans implementation based on a checkpoint.
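Column statistics are exposed on DRMs through the R-like distributed DSL; as an illustrative sketch (the local context setup is an assumption, not part of this API entry):

```scala
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.scalabindings.RLikeOps._
import org.apache.mahout.math.drm._
import org.apache.mahout.math.drm.RLikeDrmOps._
import org.apache.mahout.sparkbindings._

implicit val ctx = mahoutSparkContext(masterUrl = "local", appName = "colMeansExample")

val drmA = drmParallelize(dense((1, 2), (3, 4)))

// colMeans dispatches to the engine-specific implementation
// over the checkpointed DRM; the result is an in-core Vector.
val means = drmA.colMeans // mathematically (2.0, 3.0) for this input
```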
Convert a non-int-keyed matrix to an int-keyed one, optionally computing a mapping from the old keys to the row indices in the new matrix. The mapping, if requested, is returned as a 1-column matrix.
Broadcast support
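A typical use of broadcast support is shipping an in-core vector into a mapBlock closure via drmBroadcast, so each executor receives one read-only copy instead of a per-task serialized copy. A minimal sketch, assuming a local Mahout Spark context:

```scala
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.scalabindings.RLikeOps._
import org.apache.mahout.math.drm._
import org.apache.mahout.math.drm.RLikeDrmOps._
import org.apache.mahout.sparkbindings._

implicit val ctx = mahoutSparkContext(masterUrl = "local", appName = "broadcastExample")

val drmA = drmParallelize(dense((1, 2), (3, 4)))
val v = dvec(10, 100)

// Wrap the in-core vector for efficient distribution to tasks.
val bcastV = drmBroadcast(v)

val drmB = drmA.mapBlock() { case (keys, block) =>
  // Inside the closure, bcastV is implicitly unwrapped to the Vector.
  for (row <- 0 until block.nrow) block(row, ::) += bcastV
  keys -> block
}
```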
Load a DRM from HDFS (in the Mahout DRM format).
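A round trip through DFS might look as follows; the path is purely illustrative, and the context setup is assumed:

```scala
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.drm._
import org.apache.mahout.math.drm.RLikeDrmOps._
import org.apache.mahout.sparkbindings._

implicit val ctx = mahoutSparkContext(masterUrl = "local", appName = "dfsExample")

val drmA = drmParallelize(dense((1, 2), (3, 4)))

// Persist in Mahout DRM (SequenceFile) format, then read it back.
drmA.dfsWrite(path = "hdfs://namenode:8020/tmp/drmA")
val drmB = drmDfsRead(path = "hdfs://namenode:8020/tmp/drmA")
```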
the Spark context (intended to be implicit, but the current Scala version does not support that in combination with the type bounds)
a DRM[Any], where Any is automatically translated to the value type
Creates an empty DRM with the specified number of partitions and cardinality.
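For example, an all-zero matrix of a given geometry can be allocated directly on the cluster (context setup assumed):

```scala
import org.apache.mahout.math.drm._
import org.apache.mahout.sparkbindings._

implicit val ctx = mahoutSparkContext(masterUrl = "local", appName = "emptyDrmExample")

// A 500 x 100 empty (all-zero) DRM split across 10 partitions.
val drmZero = drmParallelizeEmpty(nrow = 500, ncol = 100, numPartitions = 10)
```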
Parallelize an in-core matrix as a Spark distributed matrix, using row ordinal indices as dataset keys.
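In sketch form (the context setup is an assumption):

```scala
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.drm._
import org.apache.mahout.sparkbindings._

implicit val ctx = mahoutSparkContext(masterUrl = "local", appName = "parallelizeExample")

val inCoreA = dense((1, 2), (3, 4))

// Rows are keyed by their ordinal indices 0 until nrow.
val drmA = drmParallelize(inCoreA, numPartitions = 2)
```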
Parallelize an in-core matrix as a Spark distributed matrix, using row labels as dataset keys.
(Optional) Sampling operation, consistent with Spark's sampling semantics.
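As with Spark's RDD sampling, the fraction is an expected proportion rather than an exact row count. A hedged sketch using drmSampleRows from the distributed DSL:

```scala
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.drm._
import org.apache.mahout.sparkbindings._

implicit val ctx = mahoutSparkContext(masterUrl = "local", appName = "sampleExample")

val drmA = drmParallelize(dense((1, 2), (3, 4), (5, 6)))

// Sample roughly 10% of the rows without replacement.
val drmSample = drmSampleRows(drmA, fraction = 0.1, replacement = false)
```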
Returns an org.apache.mahout.sparkbindings.indexeddataset.IndexedDatasetSpark from default text-delimited files. Reads a vector per row.
a comma-separated list of URIs to read from
how the text file is formatted
Returns an org.apache.mahout.sparkbindings.indexeddataset.IndexedDatasetSpark from default text-delimited files. Reads an element per row.
a comma-separated list of URIs to read from
how the text file is formatted
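As an illustrative sketch of the two readers (the input paths are hypothetical, the default schemas are assumed, and the context setup is not part of this API entry):

```scala
import org.apache.mahout.sparkbindings._

implicit val ctx = mahoutSparkContext(masterUrl = "local", appName = "idsExample")

// Vector-per-row input, e.g. "rowID<tab>itemID1:w1 itemID2:w2 ..."
val idsRows = indexedDatasetDFSRead(src = "hdfs://namenode:8020/data/rows")

// Element-per-row input, e.g. "rowID<tab>itemID" one interaction per line.
val idsElems = indexedDatasetDFSReadElements(src = "hdfs://namenode:8020/data/elements")
```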
Perform the default expression rewrite. Returns a physical plan that can be passed to exec(). A particular physical engine implementation may choose whether or not to apply these rewrites as basic rewriting rules.
Second optimizer pass. Translates the previously rewritten logical pipeline into a physical engine plan.
Spark-specific operations that are not DRM methods.