Mahout MapReduce Overview

Getting Mahout

Download the latest release

Download the latest release here.

Or check out the latest code from here.

Alternatively: Add Mahout 0.10.0 to a Maven project

Mahout is also available from a Maven repository under the group id org.apache.mahout. To import the latest release of Mahout into a Java project, add the following dependency to your pom.xml:

<dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-mr</artifactId>
    <version>0.10.0</version>
</dependency>

Features

For a full list of Mahout's features see our Features by Engine page.

Using Mahout

Mahout provides a number of examples and tutorials to help users quickly learn how to use its machine learning algorithms.

Recommendations

Check the Recommender Quickstart or the tutorial on creating a user-based recommender in 5 minutes.

If you are building a recommender system for the first time, please also refer to a list of Dos and Don'ts that might be helpful.
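For readers new to the idea, the following is a minimal plain-Java sketch of what a user-based recommender computes. It does not use Mahout's API. The class name UserBasedSketch, its methods, the sample ratings, and the choice of Pearson correlation with a similarity threshold are all assumptions made for illustration. A rating for an unseen item is estimated as the similarity-weighted average of the ratings given by sufficiently similar users.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a hypothetical class, not part of Mahout.
public class UserBasedSketch {

    // Pearson correlation over the items both users rated; 0.0 when undefined.
    static double pearson(Map<Integer, Double> a, Map<Integer, Double> b) {
        List<Integer> common = new ArrayList<>();
        for (Integer item : a.keySet()) {
            if (b.containsKey(item)) {
                common.add(item);
            }
        }
        int n = common.size();
        if (n < 2) {
            return 0.0;
        }
        double meanA = 0.0, meanB = 0.0;
        for (Integer item : common) {
            meanA += a.get(item);
            meanB += b.get(item);
        }
        meanA /= n;
        meanB /= n;
        double cov = 0.0, varA = 0.0, varB = 0.0;
        for (Integer item : common) {
            double da = a.get(item) - meanA;
            double db = b.get(item) - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        if (varA == 0.0 || varB == 0.0) {
            return 0.0;
        }
        return cov / Math.sqrt(varA * varB);
    }

    // Similarity-weighted average of the item's ratings from users whose
    // similarity to `user` exceeds `threshold`; NaN if no such user rated it.
    static double estimate(Map<Integer, Map<Integer, Double>> ratings,
                           int user, int item, double threshold) {
        Map<Integer, Double> own = ratings.get(user);
        double weighted = 0.0, total = 0.0;
        for (Map.Entry<Integer, Map<Integer, Double>> e : ratings.entrySet()) {
            if (e.getKey() == user) {
                continue;
            }
            Double rating = e.getValue().get(item);
            if (rating == null) {
                continue;
            }
            double sim = pearson(own, e.getValue());
            if (sim > threshold) {
                weighted += sim * rating;
                total += sim;
            }
        }
        return total == 0.0 ? Double.NaN : weighted / total;
    }

    // Made-up sample data: user id -> (item id -> rating).
    static Map<Integer, Map<Integer, Double>> sampleRatings() {
        Map<Integer, Map<Integer, Double>> ratings = new HashMap<>();
        double[][] rows = {
            {1, 10, 5}, {1, 11, 3}, {1, 12, 1},
            {2, 10, 5}, {2, 11, 3}, {2, 12, 1}, {2, 13, 4},
            {3, 10, 1}, {3, 11, 3}, {3, 12, 5}, {3, 13, 2}
        };
        for (double[] r : rows) {
            ratings.computeIfAbsent((int) r[0], k -> new HashMap<>())
                   .put((int) r[1], r[2]);
        }
        return ratings;
    }

    public static void main(String[] args) {
        // User 1 has not rated item 13; only user 2 is similar enough to count.
        System.out.println(estimate(sampleRatings(), 1, 13, 0.1)); // prints 4.0
    }
}
```

In this sample, user 2 agrees with user 1 on every shared item and user 3 disagrees on every one, so only user 2's rating of item 13 contributes to the estimate. A real system would also need to handle sparse data, scale, and evaluation, which is what the tutorials above cover.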

Clustering

Check the Synthetic data example.

Classification

If you are interested in how to train a Naive Bayes model, look at the 20 newsgroups example.

If you plan to build a Hidden Markov Model for speech recognition, the example here might be instructive.

Or you could build a Random Forest model by following this quick start page.

Working with Text

If you need to convert raw text into word vectors as input to clustering or classification algorithms, please refer to this page on how to create vectors from text.
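To make "word vectors" concrete, here is a small plain-Java sketch that turns a string into a sparse term-frequency vector (term to count). It is not Mahout's API or its vector format. The class name TermFrequencySketch and the tokenization rule (lowercase, split on anything that is not a letter or digit) are assumptions chosen for illustration, and a real pipeline would typically add steps such as a shared dictionary and term weighting.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a hypothetical class, not part of Mahout.
public class TermFrequencySketch {

    // Lowercase the text, split on runs of non-letter/non-digit characters,
    // and count how often each token occurs. TreeMap keeps terms sorted.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // a leading delimiter yields one empty token
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {cat=2, sat=1, slept=1, the=2}
        System.out.println(termFrequencies("The cat sat. The cat slept!"));
    }
}
```

Each document becomes a map from term to count. Clustering and classification algorithms then operate on these counts, usually after the terms have been mapped to integer indices.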