public interface IDMigrator extends Refreshable
Mahout 0.2 changed the framework to operate only in terms of numeric (long) ID values for users and items.
This is not compatible with applications that used other key types, most commonly String.
Implementations of this interface provide support for mapping Strings to longs and vice versa, in
order to give a smoother migration path to applications that must still use strings as IDs.
The mapping from strings to 64-bit numeric values is fixed here, to provide a standard implementation that
is 'portable', that is, easily reproducible outside the framework. See toLongID(String).
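The fixed string-to-long mapping can be reproduced outside the framework along the following lines. This is a minimal sketch, not Mahout's own code: it assumes the mapping is the MD5 digest of the string's UTF-8 bytes with the leading 8 bytes read as a big-endian long, which should be checked against the Mahout version in use. The class name `StringToLongSketch` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of Mahout.
final class StringToLongSketch {

    private StringToLongSketch() {}

    /** Maps a string to a long built from the first 8 bytes of its MD5 digest (assumed scheme). */
    static long toLongID(String stringID) {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
        byte[] digest = md5.digest(stringID.getBytes(StandardCharsets.UTF_8));
        long id = 0L;
        // Pack the first 8 digest bytes into a long, most significant byte first.
        for (int i = 0; i < 8; i++) {
            id = (id << 8) | (digest[i] & 0xFFL);
        }
        return id;
    }

    public static void main(String[] args) {
        // The same input always yields the same ID, so nothing needs to be stored.
        System.out.println(toLongID("alice"));
        System.out.println(toLongID("alice") == toLongID("alice"));
    }
}
```

Because the result depends only on the input string, any system that applies the same scheme computes the same ID without consulting shared state.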
Because this mapping is deterministically computable, it does not need to be stored. The job of subclasses is instead to store the reverse mapping, from long back to String.

There are infinitely many strings but only a fixed number of longs, so it is possible for two strings to map to the same value. Subclasses do not treat this as an error; they retain only the most recent mapping, overwriting any previous one. The probability of a collision in a 64-bit space is quite small, but not zero. In the context of a collaborative filtering problem, however, the consequence of a collision is minor: at worst, one user might receive another user's recommendations.
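To put a rough number on "quite small, but not zero", one can use the birthday approximation. This is an estimate under the idealized assumption that the 64-bit values behave like independent, uniformly random draws, which the documentation does not claim. Here $n$ is the number of distinct strings mapped and $N = 2^{64}$ is the number of possible long values:

$$
P(\text{at least one collision}) \;\approx\; 1 - \exp\!\left(-\frac{n(n-1)}{2N}\right) \;\approx\; \frac{n^2}{2^{65}} \quad \text{for } n \ll 2^{32}.
$$

Under that assumption, $n = 10^6$ IDs give a probability of about $2.7 \times 10^{-8}$, while $n = 10^9$ IDs give roughly $2.7\%$.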
| Modifier and Type | Method and Description |
|---|---|
| long | toLongID(String stringID) |
| String | toStringID(long longID) |
Methods inherited from interface Refreshable: refresh
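The division of labor between the two methods can be sketched with an in-memory reverse map. This is an illustrative sketch under stated assumptions, not Mahout's implementation. The class `InMemoryMigratorSketch` and its `storeMapping` method are hypothetical names, the class deliberately does not implement IDMigrator so that it compiles without Mahout on the classpath, the hash scheme (leading 8 bytes of MD5 over UTF-8) is assumed, and returning null for an unknown ID is a choice made here rather than documented behavior.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class for illustration only; not part of Mahout.
final class InMemoryMigratorSketch {

    // Only the reverse direction (long -> String) needs storage.
    private final Map<Long, String> longToString = new ConcurrentHashMap<>();

    /** Forward mapping: computed on demand, never stored (assumed MD5-based scheme). */
    long toLongID(String stringID) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(stringID.getBytes(StandardCharsets.UTF_8));
            long id = 0L;
            for (int i = 0; i < 8; i++) {
                id = (id << 8) | (digest[i] & 0xFFL);
            }
            return id;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Records the reverse mapping; on a collision the newer string replaces the older one. */
    void storeMapping(long longID, String stringID) {
        longToString.put(longID, stringID);
    }

    /** Reverse mapping: returns the stored string, or null if none was stored (an assumption). */
    String toStringID(long longID) {
        return longToString.get(longID);
    }

    public static void main(String[] args) {
        InMemoryMigratorSketch migrator = new InMemoryMigratorSketch();
        long id = migrator.toLongID("user-42");
        migrator.storeMapping(id, "user-42");
        System.out.println(migrator.toStringID(id)); // prints "user-42"
    }
}
```

A persistent variant would replace the map with a database table keyed by the long value; the overwrite-on-collision behavior would stay the same.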
long toLongID(String stringID)
Returns: the top 8 bytes of the MD5 hash of the bytes of the given String's UTF-8 encoding as a long.

String toStringID(long longID) throws TasteException
Throws: TasteException - if an error occurs while retrieving the mapping

Copyright © 2008–2017 The Apache Software Foundation. All rights reserved.