The Factorization Machine (FM) is a machine learning algorithm based on matrix factorization, proposed by Steffen Rendle. It can make predictions for any real-valued feature vector. Its main advantages are: 1) it handles highly sparse data; 2) its computational complexity is linear.
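- Factorization Machine Model

$$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j$$

where $\langle v_i, v_j \rangle$ is the dot product of two k-dimensional vectors:

$$\langle v_i, v_j \rangle = \sum_{f=1}^{k} v_{i,f} \, v_{j,f}$$

Model parameters are $w_0 \in \mathbb{R}$, $w \in \mathbb{R}^{n}$ and $V \in \mathbb{R}^{n \times k}$, where $v_i$ (the i-th row of $V$) indicates that feature i is represented by k factors, and k is the hyperparameter that determines the dimensionality of the factorization.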
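The linear-complexity claim can be checked numerically. The following is a minimal sketch, not Angel's implementation: it evaluates the standard second-order FM score (bias, plus linear term, plus pairwise term weighted by factor dot products) once with the explicit double sum and once with the well-known O(k·n) rewriting of the pairwise term. The names `w0`, `w`, `V` and both function names are assumptions made for this example.

```python
import numpy as np

def fm_predict_naive(x, w0, w, V):
    """Second-order FM score with the explicit pairwise sum, O(k * n^2)."""
    n = x.shape[0]
    score = w0 + w @ x
    for i in range(n):
        for j in range(i + 1, n):
            score += (V[i] @ V[j]) * x[i] * x[j]
    return score

def fm_predict_linear(x, w0, w, V):
    """Same score, with the pairwise term rewritten so it costs O(k * n)."""
    xv = V.T @ x                      # shape (k,): sum_i v[i, f] * x[i]
    x2v2 = (V ** 2).T @ (x ** 2)      # shape (k,): sum_i v[i, f]^2 * x[i]^2
    return w0 + w @ x + 0.5 * np.sum(xv ** 2 - x2v2)

# Toy input: n = 6 features, k = 3 factors, a sparse 0/1 feature vector.
rng = np.random.default_rng(0)
n, k = 6, 3
x = np.array([1.0, 0.0, 0.0, 1.0, 0.0, 1.0])
w0, w, V = 0.1, rng.normal(size=n), rng.normal(size=(n, k))

naive = fm_predict_naive(x, w0, w, V)
fast = fm_predict_linear(x, w0, w, V)
print(np.isclose(naive, fast))
```

With sparse input, only the non-zero entries of `x` contribute to either sum, which is why the model remains cheap on highly sparse data.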
FM can be used for a range of prediction tasks, such as:
- regression: the model output can be used directly as the predictor, and the optimization criterion is to minimize the least-squares error.
- classification: the sign of the model output is used as the predicted class, and the parameters are estimated with the hinge loss or the logistic loss.
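The optimization criteria for these tasks can be written down directly. This is an illustrative sketch under the usual conventions (labels in {-1, +1} for classification, a 1/2 factor on the squared error); the function names are assumptions for this example and are not part of Angel's API.

```python
import numpy as np

def squared_loss(y_hat, y):
    """Regression: least-squares criterion on the raw model output."""
    return 0.5 * (y_hat - y) ** 2

def logistic_loss(y_hat, y):
    """Binary classification with y in {-1, +1}: log(1 + exp(-y * y_hat))."""
    return np.logaddexp(0.0, -y * y_hat)

def hinge_loss(y_hat, y):
    """Binary classification with y in {-1, +1}: max(0, 1 - y * y_hat)."""
    return np.maximum(0.0, 1.0 - y * y_hat)

def classify(y_hat):
    """Predicted class is the sign of the model output (0 is mapped to -1 here)."""
    return np.where(y_hat > 0, 1, -1)

scores = np.array([0.3, -2.0, 1.5])
print(classify(scores))
```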
FM algorithm model
The FM model consists of two parts, wide and embedding, where wide is a typical linear model. The final output is the sum of the wide and embedding parts.
FM training process
Angel optimizes the FM model with gradient descent, training it iteratively. In each iteration the worker and the PS do the following:
- worker: pulls the wide and embedding matrices from the PS to local memory, computes the corresponding gradient updates, and pushes them to the PS.
- PS: aggregates the gradient updates pushed by all workers, averages them, and uses the optimizer to compute and update the new wide and embedding models.
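The pull, compute, push, and average cycle can be simulated in a single process. The sketch below is an assumption-heavy illustration, not Angel code: two "workers" are just two data shards, the "PS" is a function that averages the pushed gradients and applies a plain SGD step (Angel's real optimizers are adam, ftrl, or momentum), the loss is the logistic loss with labels in {-1, +1}, and the bias term is omitted for brevity.

```python
import numpy as np

def fm_scores(X, w, V):
    """Wide (linear) part plus pairwise embedding part for a batch X of shape (m, n)."""
    XV = X @ V
    pair = 0.5 * np.sum(XV ** 2 - (X ** 2) @ (V ** 2), axis=1)
    return X @ w + pair

def mean_logloss(X, y, w, V):
    return float(np.mean(np.logaddexp(0.0, -y * fm_scores(X, w, V))))

def worker_gradients(X, y, w, V):
    """Worker: gradients of the mean logistic loss on the local shard."""
    m = X.shape[0]
    s = fm_scores(X, w, V)
    dloss = -y / (1.0 + np.exp(y * s))            # dL/ds for y in {-1, +1}
    grad_w = X.T @ dloss / m
    XV = X @ V
    grad_V = (X.T @ (dloss[:, None] * XV) - ((X ** 2).T @ dloss)[:, None] * V) / m
    return grad_w, grad_V

def ps_update(params, pushed, lr):
    """PS: average the gradients pushed by all workers, then take an SGD step."""
    w, V = params
    avg_w = np.mean([g[0] for g in pushed], axis=0)
    avg_V = np.mean([g[1] for g in pushed], axis=0)
    return w - lr * avg_w, V - lr * avg_V

rng = np.random.default_rng(0)
m, n, k = 40, 8, 2
X = (rng.random((m, n)) < 0.5).astype(float)
y = np.where(X @ rng.normal(size=n) > 0, 1.0, -1.0)
w, V = np.zeros(n), 0.01 * rng.normal(size=(n, k))
shards = [(X[:20], y[:20]), (X[20:], y[20:])]     # two simulated workers

initial_loss = mean_logloss(X, y, w, V)
for _ in range(100):
    # Each worker uses the current (pulled) parameters and pushes its gradients.
    pushed = [worker_gradients(Xs, ys, w, V) for Xs, ys in shards]
    w, V = ps_update((w, V), pushed, lr=0.1)
final_loss = mean_logloss(X, y, w, V)
print(initial_loss, final_loss)
```

Because both shards have the same size, averaging the per-worker gradients here equals the gradient over the full data set; with unequal shards the average would weight workers rather than samples.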
FM prediction result:
- format: rowID,pred,prob,label
- caption: rowID is the row ID of the sample, starting from 0; pred is the predicted value of the sample; prob is the probability of the sample relative to the predicted result; label is the category assigned to the sample: 1 when pred is greater than 0, and -1 when it is less than 0.
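A small sketch of writing and reading one result line in this format. Two points are assumptions of this example rather than documented behavior: `prob` is computed as the logistic sigmoid of `pred`, and `pred == 0` (which the description leaves unspecified) is mapped to -1. Both helper names are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def format_prediction(row_id, pred):
    """Build one 'rowID,pred,prob,label' line (prob = sigmoid(pred) is an assumption)."""
    prob = sigmoid(pred)
    label = 1 if pred > 0 else -1     # pred == 0 mapped to -1 by choice
    return f"{row_id},{pred},{prob},{label}"

def parse_prediction(line):
    """Split a result line back into typed fields."""
    row_id, pred, prob, label = line.strip().split(",")
    return int(row_id), float(pred), float(prob), int(label)

line = format_prediction(0, 0.0)
print(line)
print(parse_prediction("3,1.5,0.8,1"))
```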
Data format
Two data formats are supported, libsvm and dummy. The libsvm format is as follows:
1 1:1 214:1 233:1 234:1
dummy data format:
1 1 214 233 234
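The two sample lines above appear to describe the same row. As a sketch, the parsers below read each format into a label and a sparse index-to-value mapping, assuming that the dummy format lists only the indices of non-zero features and that each listed feature has value 1. The function names are hypothetical.

```python
def parse_libsvm(line):
    """Parse 'label idx:val idx:val ...' into (label, {idx: val})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        idx, val = item.split(":")
        features[int(idx)] = float(val)
    return label, features

def parse_dummy(line):
    """Parse 'label idx idx ...'; every listed index is assumed to have value 1."""
    parts = line.split()
    return float(parts[0]), {int(idx): 1.0 for idx in parts[1:]}

libsvm_row = parse_libsvm("1 1:1 214:1 233:1 234:1")
dummy_row = parse_dummy("1 1 214 233 234")
print(libsvm_row == dummy_row)
```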
Parameter Description
- ml.epoch.num: Number of iterations
- ml.feature.index.range: Feature index range
- ml.model.size: Feature dimension
- set sampling rate
- type, "libsvm" and "dummy"
- ml.learn.rate: Learning rate
- rate decay
- ml.opt.decay.on.batch: Whether to decay each mini batch
- ml.opt.decay.alpha: Learning rate decay parameter alpha
- ml.opt.decay.beta: Learning rate decay parameter beta
- ml.opt.decay.intervals: Learning rate decay parameter intervals
- ml.reg.l2: L2 regularization
- action.type: Task type; use "train" for training and "predict" for prediction
- Number of input data fields
- of vector in embedding
- ml.inputlayer.optimizer: Optimizer type; the options are "adam", "ftrl" and "momentum"
- Whether to transform the label. The default is "NoTrans"; the options are "ZeroOneTrans" (convert to 0/1), "PosNegTrans" (convert to +1/-1), "AddOneTrans" (add 1) and "SubOneTrans" (subtract 1).
- Threshold used by the "ZeroOneTrans" and "PosNegTrans" transformations: labels greater than the threshold are mapped to 1. The threshold defaults to 0.
- Resampling ratio of positive to negative samples, useful when the numbers of positive and negative samples differ greatly (e.g., by a factor of 5 or more)
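The label transformations listed above can be sketched as a single function. This is an illustration of the described behavior, not Angel's implementation: the function name and signature are hypothetical, and a label equal to the threshold is treated here as not greater than it.

```python
def transform_label(label, mode="NoTrans", threshold=0.0):
    """Apply one of the label transformations named in the parameter list."""
    if mode == "NoTrans":
        return label
    if mode == "ZeroOneTrans":
        return 1.0 if label > threshold else 0.0
    if mode == "PosNegTrans":
        return 1.0 if label > threshold else -1.0
    if mode == "AddOneTrans":
        return label + 1.0
    if mode == "SubOneTrans":
        return label - 1.0
    raise ValueError(f"unknown label transformation: {mode}")

for mode in ("NoTrans", "ZeroOneTrans", "PosNegTrans", "AddOneTrans", "SubOneTrans"):
    print(mode, [transform_label(v, mode) for v in (-1.0, 0.0, 2.0)])
```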
Submit command
The FM algorithm can be submitted with the following command, or by constructing a JSON file that describes the computation network and running it from that file (see the Json description):
$ANGEL_HOME/bin/angel-submit \
-Dml.epoch.num=20 \ \ \
-Dml.feature.index.range=$featureNum \
-Dml.model.size=$featureNum \ \ \
-Dml.learn.rate=0.1 \
-Dml.reg.l2=0.03 \
-Daction.type=train \$fielNum \ \
-Dml.inputlayer.optimizer=ftrl \$input_path \$model_path \
-Dangel.log.path=$log_path \
-Dangel.workergroup.number $workerNumber \ $workerMemory \
-Dangel.worker.task.number $taskNumber \ $PSNumber \ $PSMemory \
-Dangel.output.path.deleteonexist true \ $storageLevel \ $taskMemory \
-Dangel.worker.env "LD_PRELOAD=./" \ \
The json file is as follows (see data):
{
  "data": {
    "format": "dummy",
    "indexrange": 148,
    "validateratio": 0.1,
    "numfield": 13,
    "sampleratio": 0.2
  },
  "train": {
    "epoch": 5,
    "lr": 0.8,
    "decayclass": "WarmRestarts",
    "decayalpha": 0.05
  },
  "model": {
    "modeltype": "T_FLOAT_DENSE",
    "modelsize": 148
  },
  "default_optimizer": {
    "type": "momentum",
    "reg2": 0.01
  },
  "layers": [
    {
      "name": "wide",
      "type": "simpleinputlayer",
      "outputdim": 1,
      "transfunc": "identity"
    },
    {
      "name": "embedding",
      "type": "embedding",
      "numfactors": 8,
      "outputdim": 104
    },
    {
      "name": "biinnersumcross",
      "type": "BiInnerSumCross",
      "inputlayer": "embedding",
      "outputdim": 1
    },
    {
      "name": "sumPooling",
      "type": "SumPooling",
      "outputdim": 1,
      "inputlayers": [
        "wide",
        "biinnersumcross"
      ]
    },
    {
      "name": "simplelosslayer",
      "type": "simplelosslayer",
      "lossfunc": "logloss",
      "inputlayer": "sumPooling"
    }
  ]
}
- Submit script
$ANGEL_HOME/bin/angel-submit \ fm \
--action.type train \ $runner \ $modelClass \ $input_path \ $model_path \
--angel.log.path $log_path \
--angel.workergroup.number $workerNumber \ $workerMemory \
--angel.worker.task.number $taskNumber \ $PSNumber \ $PSMemory \
--angel.output.path.deleteonexist true \ $storageLevel \ $taskMemory \
--angel.worker.env "LD_PRELOAD=./" \ $fm_json_path \
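Before submitting, it can help to sanity-check the JSON network description. The sketch below is a hypothetical helper, not something Angel provides: it checks that every layer only refers to layers defined before it, and that the embedding layer's `outputdim` equals `numfield * numfactors`, a relation the sample values (13 × 8 = 104) are consistent with. The entries of `inputlayers` for the `sumPooling` layer are an assumption of this example.

```python
config = {
    "data": {"numfield": 13},
    "layers": [
        {"name": "wide", "type": "simpleinputlayer", "outputdim": 1},
        {"name": "embedding", "type": "embedding", "numfactors": 8, "outputdim": 104},
        {"name": "biinnersumcross", "type": "BiInnerSumCross",
         "inputlayer": "embedding", "outputdim": 1},
        {"name": "sumPooling", "type": "SumPooling", "outputdim": 1,
         "inputlayers": ["wide", "biinnersumcross"]},   # assumed entries
        {"name": "simplelosslayer", "type": "simplelosslayer",
         "lossfunc": "logloss", "inputlayer": "sumPooling"},
    ],
}

def check_config(cfg):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    seen = set()
    numfield = cfg["data"]["numfield"]
    for layer in cfg["layers"]:
        # Collect references from both the single and the plural key.
        refs = list(layer.get("inputlayers", []))
        if "inputlayer" in layer:
            refs.append(layer["inputlayer"])
        for ref in refs:
            if ref not in seen:
                problems.append(f"{layer['name']}: unknown input layer '{ref}'")
        if layer["type"] == "embedding":
            expected = numfield * layer["numfactors"]
            if layer["outputdim"] != expected:
                problems.append(
                    f"{layer['name']}: outputdim {layer['outputdim']} != {expected}")
        seen.add(layer["name"])
    return problems

print(check_config(config))
```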