Tuesday, December 11, 2012

Big Data and Real-time Scoring with ADAPA, the Universal PMML Scoring Engine

When first released, ADAPA (Adaptive Decision And Predictive Analytics) was purely a scoring engine, used to produce scores from data mining models expressed in PMML (Predictive Model Markup Language). More recently, with the addition of a rules engine to its core, ADAPA can seamlessly combine rules and predictive models, which enables businesses to design and manage automated decisioning systems. In this way, ADAPA makes it practical to implement Enterprise Decision Management (EDM) solutions.

PMML Support and Predictive Analytics


Predictive analytics comprises a series of modeling techniques which can be used to extract relevant patterns present in large amounts of data to better predict the future.

ADAPA can generate scores from a variety of predictive modeling techniques expressed in PMML. PMML provides a standard way to express predictive models, so proprietary formats and incompatibilities are no longer a barrier to the exchange of models between applications.
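To make the idea of scoring from PMML concrete, here is a minimal Python sketch of what a PMML consumer does for the simplest case, a linear regression. The model, its field names (age, income, risk), and its coefficients are invented for illustration, and the function handles only the RegressionTable and NumericPredictor elements. It is not how ADAPA itself is implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-input regression model, written for this illustration only.
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <Header description="Illustrative linear regression"/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="risk" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="risk_model" functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="income"/>
      <MiningField name="risk" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="age" coefficient="0.25"/>
      <NumericPredictor name="income" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

NS = {"pmml": "http://www.dmg.org/PMML-4_2"}


def score_regression(pmml_text, record):
    """Return intercept + sum(coefficient * value ** exponent) for one record."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    score = float(table.get("intercept"))
    for predictor in table.findall("pmml:NumericPredictor", NS):
        value = record[predictor.get("name")]
        # The exponent attribute is optional in PMML and defaults to 1.
        exponent = int(predictor.get("exponent", "1"))
        score += float(predictor.get("coefficient")) * value ** exponent
    return score


print(score_regression(PMML_DOC, {"age": 40.0, "income": 6.0}))
```

Because the model is plain XML, the same file could be produced by one tool and scored by another, which is the point of the standard.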

Currently, ADAPA supports the following PMML elements:
    • Multinomial Logistic
    • General Linear
    • Ordinal Multinomial
    • Simple Regression
    • Generalized Linear Model
    • Cox Regression Models
  • Multiple Models: Model Composition, Ensembles, Segmentation, and Chaining (including Random Forest Models).
as well as a variety of elements involved in data pre- and post-processing:
  • Text Mining
  • Regular Expressions
  • Built-in Functions (logic and arithmetic operators as well as conditional logic)
  • Normalization
  • Discretization
  • Value Mapping
  • Custom Functions
  • Targets/Scaling
  • Outputs (including business decisions and thresholds)
  • Model Verification (which in ADAPA can also take the form of a CSV file)
Once a model is uploaded to ADAPA, it can be executed in batch or in real time. ADAPA is a PMML consumer; it can therefore execute PMML code exported from tools such as R, IBM SPSS, SAS, KNIME, KXEN, STATISTICA, BigML, and RapidMiner.


ADAPA To Go


PMML Conversion


ADAPA provides its users with the ability to automatically convert older PMML models (versions 2.0, 2.1, 3.0, 3.1, 3.2, 4.0) to version 4.2. Besides schema validation, the conversion process also corrects known issues with PMML code from several sources/vendors. The aim is to successfully validate code written in older versions of PMML and convert it to PMML 4.2.
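Before any conversion can happen, a tool has to find out which PMML version a file declares. The sketch below reads the version attribute of the root PMML element and classifies it against the versions listed above. The function names and the three status labels are hypothetical, and the sketch only inspects the version; it does not perform the conversion or the vendor-specific corrections that ADAPA applies.

```python
import xml.etree.ElementTree as ET

TARGET_VERSION = "4.2"
# Older versions that the text above lists as convertible.
CONVERTIBLE_VERSIONS = {"2.0", "2.1", "3.0", "3.1", "3.2", "4.0"}


def pmml_version(pmml_text):
    """Return the version attribute declared on the root PMML element."""
    root = ET.fromstring(pmml_text)
    # Drop a "{namespace}" prefix, if present, before checking the tag name.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "PMML":
        raise ValueError(f"root element is {local_name!r}, not 'PMML'")
    return root.get("version")


def conversion_status(pmml_text):
    """Classify a document as 'current', 'convert', or 'unlisted' (labels are illustrative)."""
    version = pmml_version(pmml_text)
    if version == TARGET_VERSION:
        return "current"
    if version in CONVERTIBLE_VERSIONS:
        return "convert"
    return "unlisted"


old_model = '<PMML xmlns="http://www.dmg.org/PMML-3_2" version="3.2"/>'
print(pmml_version(old_model), conversion_status(old_model))
```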

Transformations Generator


PMML provides a variety of data transformations, including value mapping, normalization, and discretization. It also offers several built-in functions as well as arithmetic and logical operators which can be combined to represent complex pre-processing steps. With the Transformations Generator tool, one can graphically design a transformation and obtain the respective PMML code. This can then be pasted into an existing PMML file and uploaded in ADAPA.
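For readers who want to see what these three transformations compute, here is a small Python sketch of their semantics: piecewise-linear normalization (as in NormContinuous with LinearNorm points), interval-based discretization (as in Discretize), and table lookup (as in MapValues). The function names, argument layouts, and sample data are assumptions made for this illustration. The Transformations Generator produces PMML, not Python.

```python
from bisect import bisect_right


def norm_continuous(x, points):
    """Interpolate linearly through (orig, norm) pairs sorted by orig.

    Values outside the covered range are extrapolated from the nearest
    segment, which mirrors PMML's default outlier treatment ("asIs").
    """
    origs = [orig for orig, _ in points]
    # Index of the segment containing x, clamped to the first/last segment.
    i = min(max(bisect_right(origs, x) - 1, 0), len(points) - 2)
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


def discretize(x, bins, default=None):
    """Return the label of the first bin with left <= x < right.

    Each bin is (left, right, label); None means the side is unbounded.
    """
    for left, right, label in bins:
        if (left is None or x >= left) and (right is None or x < right):
            return label
    return default


def map_values(value, table, default=None):
    """Look up a categorical value in a mapping table."""
    return table.get(value, default)


AGE_BINS = [(None, 18, "minor"), (18, 65, "adult"), (65, None, "senior")]
REGION = {"CA": "West", "NY": "East"}

print(norm_continuous(50, [(0, 0.0), (100, 1.0)]))
print(discretize(35, AGE_BINS))
print(map_values("CA", REGION, default="Other"))
```

Chaining such steps, for example discretizing a normalized value, is how more complex pre-processing is typically built from the basic elements.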

Software as a Service on the Cloud (Amazon EC2)


ADAPA predictive analytics is available through the Amazon Elastic Compute Cloud (Amazon EC2). It provides the first SaaS (Software as a Service) predictive decisioning platform. The user can upload and manage several rule sets as well as models expressed in PMML, and score data in real time through web-service calls (ADAPA will automatically convert older versions of PMML to version 4.2 and correct any known issues from different vendors). ADAPA as a Service empowers people, since it allows anyone, anywhere, to deploy and use state-of-the-art data mining models.
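As a rough picture of what a real-time scoring call looks like from the client side, the sketch below builds (but does not send) an HTTP request carrying one record. The base URL, the path layout, and the JSON body are all hypothetical placeholders and are not ADAPA's documented interface. Consult the documentation of your own ADAPA instance for the actual endpoints and formats.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; replace with the address given in your instance's documentation.
BASE_URL = "https://adapa.example.com/score"


def build_scoring_request(model_name, record):
    """Build a POST request that carries one input record as JSON (assumed format)."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{urllib.parse.quote(model_name)}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_scoring_request("risk_model", {"age": 40.0, "income": 6.0})
print(request.get_method(), request.full_url)
# Passing the request to urllib.request.urlopen() would send it;
# that step is omitted because the endpoint above is a placeholder.
```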

ADAPA Add-in for Microsoft Office Excel


To make the process of executing predictive models even simpler, Zementis also offers the ADAPA add-in for Excel 2007, 2010, and 2013 (available for free). With the add-in, anyone in the enterprise is able to score data in Excel by executing models previously deployed in the Cloud.

ADAPA allows for real-time data scoring whenever a new event occurs, since it can be used from inside any application via web-service calls. Excel is just one such application, and it happens to be a very well-known tool used by many. This is remarkable, since it frees users from having to deal with all the technology required for scoring their data whenever necessary. With the Excel add-in, all one has to do is select which data records to score (or the columns and rows containing the relevant data) and press the “Score” button in Excel … et voilà … new predictions are generated automatically for all selected records.

ADAPA Flavors


ADAPA is currently being offered in three ways:
  • In the Amazon Cloud: launch your own private instances of ADAPA on Amazon EC2.
  • On Site: ADAPA is also available for deployment on site or on your private cloud. 

In-Database Scoring


Built on the heritage of the ADAPA Decision Engine, the Universal PMML Plug-in (UPPI) is a highly optimized, in-database scoring engine for predictive models, fully supporting the PMML standard. With PMML, UPPI delivers a wide range of predictive analytics for high performance scoring. It shortens time to market for predictive models and empowers users through instant deployment of predictive models.


UPPI is available for the following platforms:

Scoring for Hadoop


Zementis and Datameer have partnered to deliver the first-ever standards-based execution of predictive analytics on a massively parallel scale. This joint solution combines the Zementis Universal PMML Scoring Engine for real-time execution of predictive models with the power and scale of Datameer, an end-to-end BI solution that includes data source integration, an analytics engine, visualization, and dashboarding.

UPPI for Datameer brings together essential technologies, offering the best combination of open standards and scalability for the application of predictive analytics. The Plug-in fully supports the Predictive Model Markup Language (PMML), the de facto standard for data mining applications, which enables the integration of predictive models from IBM SPSS, SAS, R, and many more.

UPPI is also available for Hadoop/Hive. For more information see the UPPI for Hadoop page.

References

Resources

  • Zementis Support - Help desk and support forums providing support information for PMML, ADAPA, and the Universal PMML Plug-in (UPPI).
  • ADAPA product page - contains information about ADAPA on the Cloud, on Site, and the add-in for Excel.
  • Deploy! Newsletter - monthly newsletter containing the latest news on ADAPA and predictive analytics.
  • PMML - PMML resources page including examples.
  • PMML Tools - The Transformations Generator.
  • Videos - webinars and on-line video tutorials about model deployment, ADAPA, Excel add-in, PMML, ...
  • Data Mining Group (DMG) - describes PMML, the Predictive Model Markup Language, and lists the companies currently supporting the standard.
  • Drools homepage
  • PMML 4.2 is here! - gives a short summary of the new features of the latest release of PMML.
  • PMML in Action (2nd Edition) - PMML book available on Amazon (Paperback and Kindle).
  • PMML Presentation - video of Dr. Alex Guazzelli's PMML presentation for the ACM Data Mining Group at LinkedIn.

Copyright © 2009-2014 Zementis Incorporated. All rights reserved.
