What is Apache Mahout?

The Apache Mahout™ project's goal is to build an environment for quickly creating scalable, performant machine learning applications.

The latest release, version 0.10.0, includes:

Mahout Samsara Environment
  • Distributed Algebraic optimizer
  • R-Like DSL Scala API
  • Linear algebra operations
  • Ops are extensions to Scala
  • IScala REPL based interactive shell
  • Integrates with compatible libraries like MLlib
  • Runs on distributed Spark
  • H2O support in progress
Mahout Samsara based Algorithms
  • Stochastic Singular Value Decomposition (ssvd, dssvd)
  • Stochastic Principal Component Analysis (spca, dspca)
  • Distributed Cholesky QR (thinQR)
  • Distributed regularized Alternating Least Squares (dals)
  • Collaborative Filtering: Item and Row Similarity
  • Naive Bayes Classification
  • Distributed and in-core
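
To make the "Cholesky QR" entry above more concrete, here is a minimal in-core sketch of the general technique in plain Python. It is not Mahout's thinQR implementation; the function names (gram, cholesky_lower, thin_qr) are hypothetical and chosen only for this illustration. The idea is that for a tall matrix A, the small matrix AᵀA can be accumulated as a sum over blocks of rows, its Cholesky factor gives R, and Q is recovered as A·R⁻¹.

```python
import math


def gram(a):
    """Return A^T A for a tall matrix given as a list of rows."""
    n = len(a[0])
    return [[sum(row[i] * row[j] for row in a) for j in range(n)]
            for i in range(n)]


def cholesky_lower(s):
    """Return lower-triangular L with L L^T = S (S symmetric positive definite)."""
    n = len(s)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(acc)
            else:
                l[i][j] = acc / l[j][j]
    return l


def thin_qr(a):
    """Cholesky QR: R = L^T where L L^T = A^T A, and Q = A R^{-1}."""
    l = cholesky_lower(gram(a))
    n = len(l)
    # R is the transpose of L, so it is upper triangular.
    r = [[l[j][i] for j in range(n)] for i in range(n)]
    q = []
    for row in a:
        # Solve q_row * R = row, one column at a time.
        q_row = [0.0] * n
        for j in range(n):
            acc = row[j] - sum(q_row[k] * r[k][j] for k in range(j))
            q_row[j] = acc / r[j][j]
        q.append(q_row)
    return q, r


a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
q, r = thin_qr(a)
```

Because the method forms AᵀA, it squares the condition number of A, so this sketch assumes A has full column rank and is reasonably well conditioned.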

The three major components of Mahout are an environment for building scalable algorithms, many new Scala + Spark (H2O in progress) algorithms, and Mahout's mature Hadoop MapReduce algorithms.

11 Apr 2015 - Apache Mahout's next generation version 0.10.0 released

Apache Mahout introduces a new math environment we call Samsara, for its theme of universal renewal. It reflects a fundamental rethinking of how scalable machine learning algorithms are built and customized. Mahout-Samsara is here to help people create their own math while providing some off-the-shelf algorithm implementations. At its core are general linear algebra and statistical operations along with the data structures to support them. You can use it as a library or customize it in Scala with Mahout-specific extensions that look something like R. Mahout-Samsara comes with an interactive shell that runs distributed operations on a Spark cluster. This makes prototyping or task submission much easier and allows users to customize algorithms with a whole new degree of freedom.

Mahout Algorithms include many new implementations built for speed on Mahout-Samsara. They run on Spark, and some also run on H2O, which means as much as a 10x speed increase. You’ll find robust matrix decomposition algorithms as well as a Naive Bayes classifier and collaborative filtering. The new spark-itemsimilarity enables the next generation of cooccurrence recommenders that can use entire user click streams and context in making recommendations.
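
As a rough illustration of what "cooccurrence" means in this setting, the sketch below counts how often pairs of items appear in the same user's history and ranks items by that count. It is a toy, single-machine example in plain Python with made-up data; the function names and the sample histories are hypothetical, and it uses raw counts only, so it should not be read as how spark-itemsimilarity is implemented.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(histories):
    """Count, for each item pair, how many users interacted with both items."""
    counts = defaultdict(int)
    for items in histories.values():
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return dict(counts)


def similar_items(counts, item, top_n=3):
    """Rank the other items by raw cooccurrence count with `item`."""
    scored = []
    for (a, b), n in counts.items():
        if a == item:
            scored.append((b, n))
        elif b == item:
            scored.append((a, n))
    # Sort by descending count, then by name for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical interaction histories, keyed by user ID.
histories = {
    "u1": ["ipad", "iphone", "case"],
    "u2": ["iphone", "case"],
    "u3": ["ipad", "iphone"],
    "u4": ["nexus", "case"],
}
counts = cooccurrence_counts(histories)
print(similar_items(counts, "iphone"))  # [('case', 2), ('ipad', 2)]
```

Raw counts favor popular items, so a practical recommender would normally replace them with a weighted or statistically tested score before ranking.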

Interested in helping? Join the mailing lists.

Mahout News

1 February 2014 - Apache Mahout 0.9 released

Visit our release notes page for details.

25 July 2013 - Apache Mahout 0.8 released

Visit our release notes page for details.

16 June 2012 - Apache Mahout 0.7 released

Visit our release notes page for details.

6 Feb 2012 - Apache Mahout 0.6 released

Visit our release notes page for details.

9 Oct 2011 - Mahout in Action released

The book Mahout in Action is available in print. Sean Owen, Robin Anil, Ted Dunning and Ellen Friedman thank the community (especially those who were reviewers) for input during the process and hope it is enjoyable.

Find it at your favorite bookstore, or order print and eBook copies from Manning -- use discount code "mahout37" for 37% off.