JPMML-Evaluator

Java Evaluator API for Predictive Model Markup Language (PMML).

Features

JPMML-Evaluator is de facto the reference implementation of the PMML specification for the Java platform:

  1. Pre-processing of active fields according to the [DataDictionary](http://www.dmg.org/v4-2-1/DataDictionary.html) and [MiningSchema](http://www.dmg.org/v4-2-1/MiningSchema.html) elements:
     • Complete data type system.
     • Complete operational type system.
     • Treatment of outlier, missing and/or invalid values.
  2. Model evaluation.
  3. Post-processing of target fields according to the [Targets](http://www.dmg.org/v4-2-1/Targets.html) element:
     • Rescaling and/or casting regression results.
     • Replacing a missing regression result with the default value.
     • Replacing a missing classification result with the map of prior probabilities.
  4. Calculation of auxiliary output fields according to the [Output](http://www.dmg.org/v4-2-1/Output.html) element:
     • Over 20 different result feature types.
  5. Model verification according to the [ModelVerification](http://www.dmg.org/v4-2-1/ModelVerification.html) element.

For more information, please see the [features.md](https://github.com/jpmml/jpmml-evaluator/blob/master/features.md) file.

JPMML-Evaluator has been tested with popular open-source PMML producer software:

JPMML-Evaluator is thread safe and can easily deliver over one million scorings per second (on a single quad-core CPU) when working with simpler models.

Prerequisites

  • Java 1.7 or newer

Installation

JPMML-Evaluator library JAR files (together with accompanying Java source and Javadocs JAR files) are released via [Maven Central Repository](http://repo1.maven.org/maven2/org/jpmml/). Please join the [JPMML mailing list](https://groups.google.com/forum/#!forum/jpmml) for release announcements.

The current version is 1.2.5 (17 September, 2015).

<dependency>
	<groupId>org.jpmml</groupId>
	<artifactId>pmml-evaluator</artifactId>
	<version>1.2.5</version>
</dependency>

Usage

JPMML-Evaluator depends on the [JPMML-Model](https://github.com/jpmml/jpmml-model) library for the PMML class model.

Loading a PMML schema version 3.X or 4.X document into an org.dmg.pmml.PMML instance:

PMML pmml;

InputStream is = ...;

try {
	Source transformedSource = ImportFilter.apply(new InputSource(is));

	pmml = JAXBUtil.unmarshalPMML(transformedSource);
} finally {
	is.close();
}

If the model type is known, then it is possible to instantiate the corresponding subclass of org.jpmml.evaluator.ModelEvaluator directly:

PMML pmml = ...;

ModelEvaluator<TreeModel> modelEvaluator = new TreeModelEvaluator(pmml);

Otherwise, if the model type is unknown, then the model evaluator instantiation work should be delegated to an instance of class org.jpmml.evaluator.ModelEvaluatorFactory:

PMML pmml = ...;

ModelEvaluatorFactory modelEvaluatorFactory = ModelEvaluatorFactory.newInstance();
 
ModelEvaluator<?> modelEvaluator = modelEvaluatorFactory.newModelManager(pmml);

Model evaluator classes follow functional programming principles and are completely thread safe.
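
Because evaluators are thread safe, a single instance can be shared between worker threads. The following is a minimal sketch of that pattern, not part of the library: `ParallelScoring` and `scoreAll` are hypothetical names, the scorer is modelled as a plain `java.util.function.Function` so that the block compiles without JPMML on the classpath, and Java 8 or newer is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper (not a JPMML-Evaluator class)
final class ParallelScoring {

	private ParallelScoring(){
	}

	// Scores all records on a fixed-size thread pool, sharing one scorer instance between threads.
	// Results are returned in the order of the input records.
	static <I, O> List<O> scoreAll(Function<I, O> scorer, List<I> records, int threads){
		ExecutorService executor = Executors.newFixedThreadPool(threads);

		try {
			List<Callable<O>> tasks = new ArrayList<>();
			for(I record : records){
				tasks.add(() -> scorer.apply(record));
			}

			List<O> results = new ArrayList<>();
			for(Future<O> future : executor.invokeAll(tasks)){
				results.add(future.get());
			}

			return results;
		} catch(InterruptedException ie){
			// Restore the interrupt flag before giving up
			Thread.currentThread().interrupt();

			throw new IllegalStateException("Interrupted while scoring", ie);
		} catch(ExecutionException ee){
			throw new IllegalStateException("Scoring failed", ee.getCause());
		} finally {
			executor.shutdown();
		}
	}
}
```

With the library on the classpath, the scorer argument would presumably be a method reference to the evaluator's `evaluate` method, and each record a prepared arguments map.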

Model evaluator instances are fairly lightweight, which makes them cheap to create and destroy. Nevertheless, long-running applications should maintain a one-to-one mapping between PMML and ModelEvaluator instances for better performance.
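
One way to keep such a one-to-one mapping in a long-running application is a small cache that builds each evaluator once and hands out the same instance afterwards. This is only a sketch under assumptions: `EvaluatorCache` is a hypothetical helper, the key type (e.g. a model file name) and the loader function are supplied by the application, and Java 8 or newer is assumed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical helper (not a JPMML-Evaluator class): one evaluator per model key
final class EvaluatorCache<K, E> {

	private final ConcurrentMap<K, E> cache = new ConcurrentHashMap<>();

	private final Function<K, E> loader;

	// The loader turns a model key into an evaluator, e.g. by parsing a PMML file
	EvaluatorCache(Function<K, E> loader){
		this.loader = loader;
	}

	// Returns the cached evaluator; the loader is invoked at most once per key
	E get(K key){
		return cache.computeIfAbsent(key, loader);
	}

	// Number of evaluators created so far
	int size(){
		return cache.size();
	}
}
```

In application code, the loader would wrap the PMML loading and evaluator instantiation steps shown above.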

It is advisable for application code to work against the org.jpmml.evaluator.Evaluator interface:

Evaluator evaluator = (Evaluator)modelEvaluator;

An evaluator instance can be queried for the definition of active (i.e. independent), target (i.e. primary dependent) and output (i.e. secondary dependent) fields:

List<FieldName> activeFields = evaluator.getActiveFields();
List<FieldName> targetFields = evaluator.getTargetFields();
List<FieldName> outputFields = evaluator.getOutputFields();

The PMML scoring operation must be invoked with valid arguments. Otherwise, the behaviour of the model evaluator class is unspecified.

The preparation of field values:

Map<FieldName, FieldValue> arguments = new LinkedHashMap<FieldName, FieldValue>();

List<FieldName> activeFields = evaluator.getActiveFields();
for(FieldName activeField : activeFields){
	// The raw (i.e. user-supplied) value could be any Java primitive value
	Object rawValue = ...;

	// The raw value is passed through: 1) outlier treatment, 2) missing value treatment, 3) invalid value treatment and 4) type conversion
	FieldValue activeValue = evaluator.prepare(activeField, rawValue);

	arguments.put(activeField, activeValue);
}
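
The raw values have to come from somewhere, for example from one row of a tab-separated file. The sketch below shows one possible way to turn such a row into a lookup table of raw values. `RawRecords` and `parseLine` are hypothetical names, and treating an empty cell as a missing value is an assumption of this example, not documented library behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a JPMML-Evaluator class)
final class RawRecords {

	private RawRecords(){
	}

	// Splits one tab-separated line and pairs each cell with its column name.
	// Empty cells are stored as null (i.e. missing).
	static Map<String, Object> parseLine(String[] header, String line){
		// The negative limit keeps trailing empty cells
		String[] cells = line.split("\t", -1);

		if(cells.length != header.length){
			throw new IllegalArgumentException("Expected " + header.length + " cells, got " + cells.length);
		}

		Map<String, Object> record = new LinkedHashMap<>();
		for(int i = 0; i < header.length; i++){
			String cell = cells[i];

			record.put(header[i], cell.isEmpty() ? null : cell);
		}

		return record;
	}
}
```

Inside the loop above, the raw value could then be looked up as `record.get(activeField.getValue())`, assuming that the column names match the PMML field names.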

The scoring:

Map<FieldName, ?> results = evaluator.evaluate(arguments);

Typically, a model has exactly one target field:

FieldName targetName = evaluator.getTargetField();

Object targetValue = results.get(targetName);

The target value is either a Java primitive value (as a wrapper object) or an instance of org.jpmml.evaluator.Computable:

if(targetValue instanceof Computable){
	Computable computable = (Computable)targetValue;

	Object primitiveValue = computable.getResult();
}
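
The same unwrapping can be applied to a whole results map. The sketch below is illustrative only: `ResultDecoder` is a hypothetical helper, and the nested `Computable` interface is a stand-in for org.jpmml.evaluator.Computable that exists purely so that the block compiles on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for org.jpmml.evaluator.Computable so that this sketch compiles on its own.
// Delete it and import the real interface when using the library.
interface Computable {

	Object getResult();
}

// Hypothetical helper (not a JPMML-Evaluator class)
final class ResultDecoder {

	private ResultDecoder(){
	}

	// Returns the plain Java value behind a single result value
	static Object decode(Object value){

		if(value instanceof Computable){
			Computable computable = (Computable)value;

			return computable.getResult();
		}

		// Already a plain value (or null)
		return value;
	}

	// Decodes every entry of a results map, preserving iteration order
	static <K> Map<K, Object> decodeAll(Map<K, ?> results){
		Map<K, Object> decoded = new LinkedHashMap<>();

		for(Map.Entry<K, ?> entry : results.entrySet()){
			decoded.put(entry.getKey(), decode(entry.getValue()));
		}

		return decoded;
	}
}
```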

The target value may implement interfaces that descend from interface org.jpmml.evaluator.ResultFeature:

// Test for "entityId" result feature
if(targetValue instanceof HasEntityId){
	HasEntityId hasEntityId = (HasEntityId)targetValue;
	HasEntityRegistry<?> hasEntityRegistry = (HasEntityRegistry<?>)evaluator;
	BiMap<String, ? extends Entity> entities = hasEntityRegistry.getEntityRegistry();
	Entity winner = entities.get(hasEntityId.getEntityId());

	// Test for "probability" result feature
	if(targetValue instanceof HasProbability){
		HasProbability hasProbability = (HasProbability)targetValue;
		Double winnerProbability = hasProbability.getProbability(winner.getId());
	}
}

Example applications

Module pmml-evaluator-example exemplifies the use of the JPMML-Evaluator library.

This module can be built using [Apache Maven](http://maven.apache.org/):

mvn clean install

The resulting uber-JAR file target/example-1.2-SNAPSHOT.jar contains the following command-line applications:

Evaluating model model.pmml by loading input data records from input.tsv and storing output data records to output.tsv:

java -cp target/example-1.2-SNAPSHOT.jar org.jpmml.evaluator.EvaluationExample --model model.pmml --input input.tsv --output output.tsv

Optimizing model model.pmml by applying a list of four Visitor classes to it. The JVM option javaagent loads the [JPMML agent](https://github.com/jpmml/jpmml-model/tree/master/pmml-agent), which provides memory usage measurement functionality:

java -javaagent:pmml-agent-1.2-SNAPSHOT.jar -cp target/example-1.2-SNAPSHOT.jar org.jpmml.evaluator.OptimizationExample --model model.pmml --visitor-classes org.jpmml.model.visitors.LocatorNullifier,org.jpmml.model.visitors.ArrayListOptimizer,org.jpmml.model.visitors.StringInterner,org.jpmml.evaluator.visitors.PredicateInterner

Documentation

The [Openscoring.io blog](http://openscoring.io/blog/) contains fully worked-out examples of using the JPMML-Model and JPMML-Evaluator libraries.

Recommended reading:

License

JPMML-Evaluator is dual-licensed under the [GNU Affero General Public License (AGPL) version 3.0](http://www.gnu.org/licenses/agpl-3.0.html) and a commercial license.

Additional information

Please contact [[email protected]] (mailto:[email protected])
