You’ve probably already noticed Mahout has a lot of things going on at different levels, and it can be hard to know where to start. Let’s provide an overview to help you see how the pieces fit together. In general the stack is something like this:

1. Application Code
2. Samsara Scala-DSL (Syntactic Sugar)
3. Logical/Physical DAG
4. Engine Bindings
5. Code runs in Engine
6. Native Solvers

## Application Code

You have a Java/Scala application (skip this if you’re working from an interactive shell or Apache Zeppelin):

    def main(args: Array[String]): Unit = {
      println("Welcome to My Mahout App")
      if (args.isEmpty) {
        // ...
      }
    }


This may seem like a trivial part to call out, but the point is important: Mahout runs inline with your regular application code. For example, if this is an Apache Spark app, then you do all your Spark things, including ETL and data prep, in the same application, and then invoke Mahout’s mathematically expressive Scala DSL when you’re ready to math on it.

## Samsara Scala-DSL (Syntactic Sugar)

So when you get to a point in your code where you’re ready to math it up (on Spark, in this example), you can elegantly express yourself mathematically.

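    import org.apache.mahout.math.drm._
    import org.apache.mahout.math.drm.RLikeDrmOps._
    import org.apache.mahout.sparkbindings._

    // sc is an existing SparkContext; rddA and rddB are existing RDDs of (index, Vector) rows
    implicit val sdc: org.apache.mahout.sparkbindings.SparkDistributedContext = sc2sdc(sc)

    val A = drmWrap(rddA)
    val B = drmWrap(rddB)

    val C = A.t %*% A + A %*% B.t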


We’ve defined a Mahout distributed context (here a SparkDistributedContext, which is a wrapper on the Spark Context), and two Distributed Row Matrices (DRMs), which are wrappers around RDDs (in Spark).

## Logical / Physical DAG

At this point a bit of optimization happens. For example, consider the expression

    A.t %*% A

which is

$$\mathbf{A^\intercal A}$$

Transposing a large matrix is a very expensive thing to do, and in this case we don’t actually need to do it. There is a more efficient way to calculate $$\mathbf{A^\intercal A}$$ that doesn’t require a physical transpose.

(Image showing this)
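
One way to see why no physical transpose is needed is a standard linear algebra identity. This is a sketch of the idea, not necessarily a description of exactly how Mahout implements the operator. If $$\mathbf{a}_i^\intercal$$ denotes the $$i$$-th row of an $$m \times n$$ matrix $$\mathbf{A}$$, then

$$\mathbf{A^\intercal A} = \sum_{i=1}^{m} \mathbf{a}_i \mathbf{a}_i^\intercal$$

Each row contributes an $$n \times n$$ outer product, so the result can be accumulated by reading $$\mathbf{A}$$ row by row. Since a DRM is already stored row-wise, each partition can compute a partial sum over its own rows, and the partial sums are then added together.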

Mahout converts this code into something that looks like:

    OpAtA(A) + OpABt(A, B) // illustrative pseudocode, using the names of the real operators


There’s a little more magic that happens at this level, but the punchline is that Mahout translates the pretty Scala into a series of operators, which at the next level are implemented by the engine.
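
To make the translation step less abstract, here is a toy, self-contained sketch in plain Scala. It does not use Mahout, and every name in it (`Expr`, `Leaf`, `Transpose`, `Times`, `Plus`, `AtA`, `ABt`, `Optimizer`, `DagDemo`) is hypothetical. It only illustrates the general technique: the DSL operators build an expression tree, and a rewrite pass replaces patterns such as "transpose, then multiply" with fused operators.

```scala
// Toy logical plan. Illustrative names only, not Mahout's classes.
sealed trait Expr {
  def t: Expr = Transpose(this)                 // A.t
  def %*%(that: Expr): Expr = Times(this, that) // A %*% B
  def +(that: Expr): Expr = Plus(this, that)    // A + B
}
case class Leaf(name: String) extends Expr
case class Transpose(a: Expr) extends Expr
case class Times(a: Expr, b: Expr) extends Expr
case class Plus(a: Expr, b: Expr) extends Expr
// Fused operators that never materialize a transpose.
case class AtA(a: Expr) extends Expr
case class ABt(a: Expr, b: Expr) extends Expr

object Optimizer {
  // Recursively replace transpose-and-multiply patterns with fused operators.
  def rewrite(e: Expr): Expr = e match {
    case Times(Transpose(a), b) if a == b => AtA(rewrite(a))
    case Times(a, Transpose(b))           => ABt(rewrite(a), rewrite(b))
    case Times(a, b)                      => Times(rewrite(a), rewrite(b))
    case Plus(a, b)                       => Plus(rewrite(a), rewrite(b))
    case Transpose(a)                     => Transpose(rewrite(a))
    case other                            => other
  }
}

object DagDemo {
  val A = Leaf("A")
  val B = Leaf("B")
  // Building the expression does no math; it only records what was asked for.
  val logical: Expr = A.t %*% A + A %*% B.t
  // Plus(AtA(Leaf("A")), ABt(Leaf("A"), Leaf("B")))
  val physical: Expr = Optimizer.rewrite(logical)
}
```

Because nothing is computed while the expression is being built, the rewrite pass sees the whole expression at once and can choose cheaper operators before any work is sent to the engine.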

## Engine Bindings and Engine Level Ops

When one creates new engine bindings, one is in essence doing two things:

1. Defining what the engine-specific underlying structure for a DRM is (in Spark it’s an RDD). The underlying structure also has rows of MahoutVectors, so in Spark it is an RDD[(index, MahoutVector)]. This will be important when we get to the native solvers.
2. Implementing a set of BLAS (Basic Linear Algebra Subprograms) functions for working on the underlying structure. In Spark this means implementing things like AtA on an RDD. See the sparkbindings on GitHub.

Now your mathematically expressive Samsara Scala code has been translated into optimized engine-specific functions.
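
The two responsibilities of an engine binding can be sketched with a toy, self-contained example in plain Scala. It uses neither Spark nor Mahout: `ToyEngine`, `Row`, `Partitions`, `partialAtA` and `atA` are hypothetical names, a `Seq` of partitions stands in for an RDD, and `Array[Double]` stands in for a MahoutVector. The sketch assumes the row-by-row formulation of AtA (a sum of outer products of the rows), which may differ from what the real Spark bindings do.

```scala
object ToyEngine {
  // Responsibility 1: the underlying structure, partitions of (rowIndex, rowVector).
  type Row = (Int, Array[Double])
  type Partitions = Seq[Seq[Row]]

  // Sum of outer products of the rows in one partition: an n x n partial result.
  def partialAtA(rows: Seq[Row], n: Int): Array[Array[Double]] = {
    val acc = Array.fill(n, n)(0.0)
    for ((_, row) <- rows; i <- 0 until n; j <- 0 until n)
      acc(i)(j) += row(i) * row(j)
    acc
  }

  // Responsibility 2: an engine-level operator. Map each partition to a
  // partial result, then reduce the partial results by element-wise addition.
  def atA(drm: Partitions, n: Int): Array[Array[Double]] =
    drm.map(partialAtA(_, n)).foldLeft(Array.fill(n, n)(0.0)) { (x, y) =>
      Array.tabulate(n, n)((i, j) => x(i)(j) + y(i)(j))
    }
}
```

For the 2 x 2 matrix with rows (1, 2) and (3, 4), stored as one row per partition, `ToyEngine.atA` returns the rows (10, 14) and (14, 20), and no transpose is ever built.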

## Native Solvers

Recall how I said the rows of the DRMs are org.apache.mahout.math.Vector? Here is where this becomes important. I’m going to explain this in the context of Spark, but the principles apply to all distributed backends.

If you are familiar with how mapping and reducing work in Spark, then envision this RDD of MahoutVectors: each partition, an indexed collection of vectors, is a block of the distributed matrix. This block, however, is entirely in-core, and is therefore treated like an in-core matrix.

Now Mahout defines its own in-core BLAS packs and refers to them as Native Solvers. The default native solver is just the plain old JVM, which is painfully slow, but works just about anywhere.

When the data gets to the node, an operation on the matrix block is called. In the same way that Mahout converts abstract operators on the DRM into implementations on the various distributed engines, it calls abstract operators on the in-core matrices and vectors, which are implemented by the various native solvers.

The default “native solver” is the JVM, which isn’t native at all, and if no actual native solvers are present, operations will fall back to it. However, if a native solver is present (that is, its jar was added to the notebook), then the magic will happen.
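
The fallback behavior can be pictured with a small, self-contained sketch in plain Scala. This is not Mahout’s solver API: `Solver`, `JvmSolver`, `FakeNativeSolver` and `SolverRegistry` are hypothetical names, and both the availability flag and the priority order are assumptions made for illustration.

```scala
// Illustrative only: hypothetical traits and names, not Mahout's solver API.
trait Solver {
  def name: String
  def isAvailable: Boolean
  def dot(x: Array[Double], y: Array[Double]): Double
}

object JvmSolver extends Solver {
  val name = "jvm"
  val isAvailable = true // always present, so it is the universal fallback
  def dot(x: Array[Double], y: Array[Double]): Double =
    x.zip(y).map { case (a, b) => a * b }.sum
}

// Stand-in for a native backend whose jar may or may not be on the classpath.
class FakeNativeSolver(val name: String, val isAvailable: Boolean) extends Solver {
  // A real backend would hand the data to native code here; this fake reuses the JVM.
  def dot(x: Array[Double], y: Array[Double]): Double = JvmSolver.dot(x, y)
}

object SolverRegistry {
  // Take the first available solver in priority order, else fall back to the JVM.
  def pick(candidates: Seq[Solver]): Solver =
    candidates.find(_.isAvailable).getOrElse(JvmSolver)
}
```

The calling code only ever sees the abstract `Solver` interface, so the same in-core operation runs unchanged whether a native backend is found or not.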

Imagine we still have our Spark executor, with this block of a matrix sitting in its memory. Now let’s suppose the ViennaCL-OMP native solver is in use. When Spark calls an operation on this in-core matrix, the matrix dumps out of the JVM and the calculation is carried out on all available CPUs.

In a similar way, the ViennaCL native solver dumps the matrix out of the JVM and looks for a GPU to execute the operations on.

Once the operations are complete, the result is loaded back up into the JVM, and Spark (or whatever distributed engine is in use) ships it back to the driver.

The native solver operations are only defined on org.apache.mahout.math.Vector and org.apache.mahout.math.Matrix, which is why it is critical that the underlying structure is composed row-wise of Vectors or Matrices.