Friday, December 9, 2011
Friday, December 2, 2011
Tuesday, October 25, 2011
Luckily, the advent of PMML (Predictive Model Markup Language) changed this scenario radically. PMML is the de facto standard used to represent predictive solutions. In this way, there is no need for scientists to write a Word document describing the solution. They can just export it as a PMML file. Today, all major data mining tools and statistical packages support PMML. These include IBM SPSS, SAS, R, KNIME, RapidMiner, KXEN, ... Also, tools such as the Zementis Transformations Generator and KNIME allow for easy PMML coding of pre- and post-processing steps.
Great! Once a PMML file exists, it can be easily deployed in production with ADAPA, the Zementis scoring engine. ADAPA even allows for models to be deployed in the Amazon Cloud and be accessed from anywhere via web services. Zementis also offers in-database scoring via its Universal PMML Plug-in, which is also available for Hadoop. In this way, a process that could take six months now takes minutes.
PMML and ADAPA have transformed model deployment forever. If you or your company are still spending time and resources in deploying your predictive analytics the traditional way, make sure to contact us. The secret behind exceptional predictive analytics is out!
Friday, April 22, 2011
Once a predictive model is exported from a PMML-compliant tool such as SAS EM, SPSS/IBM, R, KNIME, RapidMiner, ... it can be uploaded directly into the Zementis ADAPA engine which makes the model available for execution via its console or as a web-service. ADAPA can already import most of the techniques defined by the PMML standard and now, with the release of ADAPA 3.4, we have expanded it even further to cover Association Rules.
Analysts always want to explore rules and relations between variables in large data sets. The learning mechanism of Association Rules serves this purpose. The rules discovered by Association Rules often provide useful information for marketing activities. For example, they can be used for discovering relations between products in transaction data in supermarkets. In this way, an association rule can be found to indicate that if a customer purchases beef and cat food together, he/she is most likely to also buy tuna cans.
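To make the supermarket example concrete, here is a minimal Python sketch of how the support and confidence of such a rule could be measured. The baskets are made-up illustration data and the helper name `rule_metrics` is hypothetical; neither comes from ADAPA or from the PMML standard.

```python
# Made-up shopping baskets, for illustration only.
BASKETS = [
    {"beef", "cat food", "tuna"},
    {"beef", "cat food", "tuna", "milk"},
    {"beef", "cat food"},
    {"beef", "bread"},
    {"tuna", "milk"},
]


def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_total = len(baskets)
    # Baskets that contain every item on the left-hand side of the rule.
    n_antecedent = sum(1 for b in baskets if antecedent <= b)
    # Baskets that contain both sides of the rule.
    n_both = sum(1 for b in baskets if (antecedent | consequent) <= b)
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


support, confidence = rule_metrics(BASKETS, {"beef", "cat food"}, {"tuna"})
print(f"support={support:.2f} confidence={confidence:.2f}")
```

Support is the share of all baskets that contain the whole rule, while confidence is the share of baskets with beef and cat food that also contain tuna. A rule-learning algorithm typically keeps only the rules that clear a minimum value for both.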
An Association Rule Model in PMML is represented by the element "AssociationModel". ADAPA and PMML support two different formats for representing Association Rules. These are "rectangular" and "transactional". To learn more about these two formats, please read our posting: Association Rules in ADAPA.
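The exact layouts ADAPA expects are described in the posting mentioned above. As a rough illustration only, the Python sketch below assumes that "transactional" means one row per (transaction id, item) pair and that "rectangular" means one row per transaction with one true/false column per item. The function name `to_rectangular` and the sample rows are hypothetical.

```python
from collections import defaultdict

# Assumed "transactional" layout: one (transaction id, item) pair per row.
transactional_rows = [
    ("t1", "beef"), ("t1", "cat food"), ("t1", "tuna"),
    ("t2", "beef"), ("t2", "bread"),
]


def to_rectangular(rows):
    """Pivot (id, item) rows into one dict per transaction, one flag per item."""
    items = sorted({item for _, item in rows})
    grouped = defaultdict(set)
    for tid, item in rows:
        grouped[tid].add(item)
    return [
        {"id": tid, **{item: item in bought for item in items}}
        for tid, bought in sorted(grouped.items())
    ]


rectangular_rows = to_rectangular(transactional_rows)
for row in rectangular_rows:
    print(row)
```

Under these assumptions both layouts carry the same information; the choice mostly depends on how the source system already stores its transactions.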
In addition, with the release of ADAPA 3.4, we were able to make ADAPA even better when it comes to converting and correcting PMML files. This is yet another big step towards true interoperability. In many cases, even if the model has syntactic or semantic problems, ADAPA automatically corrects known issues for models exported from several model development environments. For that, we analyze PMML files submitted to us by our partners and clients.
If, for any reason, your PMML code cannot be converted or corrected automatically, feel free to contact us. We are here to help!
Wednesday, April 20, 2011
What do you mean by standards-based?
ADAPA executes predictive models represented in PMML (Predictive Model Markup Language). PMML is the standard for representing predictive models, and all major commercial and open-source data mining tools can currently export it.
With PMML, you can basically build your model in IBM SPSS, SAS, R, KNIME, ... export it as a PMML file and upload it in ADAPA. Once you do that, your model is ready to be used from anywhere via web-services. You can even execute your models directly from within Excel.
Is ADAPA really fast?
ADAPA is very fast. We recently published a study in the ACM SIGKDD newsletter in which we show that ADAPA can easily score thousands of transactions per second. On the High-CPU Extra-Large instance, ADAPA can score 300 million transactions per hour. FAST!
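For readers who want to relate the hourly figure to a per-second rate, the conversion is simple arithmetic. The short Python snippet below only restates the number quoted above; it is not a benchmark.

```python
TRANSACTIONS_PER_HOUR = 300_000_000  # figure quoted in the answer above
SECONDS_PER_HOUR = 60 * 60

per_second = TRANSACTIONS_PER_HOUR / SECONDS_PER_HOUR
print(f"{per_second:,.0f} transactions per second")  # prints "83,333 transactions per second"
```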
What kind of models does it support?
Modeling techniques currently supported are:
- Neural Networks
- Association Rules
- Support Vector Machines
- Naive Bayes Classifiers
- Ruleset Models
- Clustering Models (including Two-Step Clustering)
- Decision Trees
- Regression Models (including Cox Regression Models)
- Multiple Models (ensemble, composition, and segmentation)
How about data pre- and post-processing?
ADAPA transforms your raw data into meaningful feature detectors before scoring it. It post-processes the output of your predictive model so that it conforms to your requirements. ADAPA supports all the PMML built-in functions and data manipulations (as well as user defined functions). To learn more about how to represent pre- and post-processing operations in PMML, please take a look at our PMML data manipulation primer or simply contact us.
Can I combine predictive analytics with business rules?
ADAPA provides seamless integration of predictive analytics and rules. Simply put, ADAPA allows data-driven insight and expert knowledge to be combined into a single and powerful decision strategy. That is because, in addition to a sophisticated predictive analytics engine, ADAPA also incorporates the full functionality of a rules engine.
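The idea of combining the two kinds of knowledge can be sketched in a few lines of Python. Everything below is hypothetical: the field names, the thresholds, and the hand-written `model_score` formula merely stand in for a deployed model and a rule base, and none of it reflects how ADAPA implements its rules engine.

```python
def model_score(applicant: dict) -> float:
    """Stand-in for a deployed predictive model; returns a risk score in [0, 1]."""
    # Hypothetical hand-written formula, used only so the example runs.
    late_flag = 1.0 if applicant["late_payments"] > 2 else 0.0
    raw = 0.6 * applicant["utilization"] + 0.4 * late_flag
    return min(max(raw, 0.0), 1.0)


def decide(applicant: dict) -> str:
    """Apply an expert rule first, then combine a second rule with the score."""
    # Expert knowledge: a hard policy rule that overrides any score.
    if applicant["age"] < 18:
        return "decline"
    score = model_score(applicant)
    # Expert knowledge adjusts how the data-driven score is interpreted.
    threshold = 0.7 if applicant["existing_customer"] else 0.5
    return "review" if score >= threshold else "approve"


new_risky = {"age": 30, "utilization": 0.9, "late_payments": 3, "existing_customer": False}
known_safe = {"age": 30, "utilization": 0.5, "late_payments": 0, "existing_customer": True}
minor = {"age": 17, "utilization": 0.1, "late_payments": 0, "existing_customer": False}

for applicant in (new_risky, known_safe, minor):
    print(decide(applicant))
```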
How do I pay for it? Is it expensive?
Once you sign up for ADAPA on the Cloud through Amazon.com, ADAPA charges show up on your credit card bill. Amazon handles all the billing. You can even use the same account you use to buy books. ADAPA on the Cloud does not cost an arm and a leg. Check out our pricing! And, the best part, you pay only for what you actually use.
Tuesday, April 19, 2011
Presented: Wednesday, April 13th, 2011
Presenters: Alex Guazzelli, Vice President - Analytics, Zementis Inc.
David Smith, Vice President - Marketing, Revolution Analytics
View the on-demand replay of the webinar
Download the webinar presentation
The rule in the past was that whenever a predictive model was built in a particular development environment, it remained in that environment forever, unless it was manually recoded to work somewhere else. This rule has been shattered with the advent of PMML (Predictive Model Markup Language). By providing a uniform standard to represent predictive models, PMML allows for the exchange of predictive solutions between different applications and various vendors.
In this joint webinar from Revolution Analytics and Zementis, you’ll learn:
- How to use data to create predictive models in the R language, with Revolution R Enterprise
- The purpose of the PMML standard, and predictive models it supports
- How to export predictive models from R using PMML
- How to score predictive models in PMML using ADAPA, from within Microsoft Excel and in the cloud
Download the whitepaper: Deploying Advanced Analytics Using R & PMML
Monday, April 18, 2011
Wednesday, April 6, 2011
Saturday, March 12, 2011
Developed by the Data Mining Group (DMG), PMML is supported by all major data mining vendors, e.g., IBM SPSS, SAS, Teradata, FICO, STATISTICA, MicroStrategy, TIBCO and Revolution Analytics, as well as open source tools like R, KNIME and RapidMiner. With PMML, models built in any of these data mining tools can now instantly be deployed in the EMC Greenplum database. The net result is the ability to leverage the power of standards-based predictive analytics on a massive scale, right where the data resides.
"By partnering with Zementis, a true PMML innovator, we are able to offer a vendor-agnostic solution for moving enterprise-level predictive analytics into the database execution environment," said Dr. Steven Hillion, Vice President of Analytics at EMC Greenplum. "With Zementis and PMML, the de-facto standard for representing data mining models, we are eliminating the need to recode predictive analytic models in order to deploy them within our database. In turn, this enables an analyst to reduce the time to insight required in most businesses today."
Want to learn more?
To learn more about how the EMC Greenplum Database and the Universal PMML Plug-in work together, feel free to:
Contact us today for more information.
Thursday, March 10, 2011
Friday, February 11, 2011
Starting with data analysis and model development, you can effectively use the Predictive Model Markup Language (PMML) standard to move complex decision models from the scientist's desktop into a scalable production environment hosted on the Amazon Elastic Compute Cloud (Amazon EC2).
Expressing Models in PMML
PMML is an XML-based language used to define predictive models. It was specified by the Data Mining Group (DMG), an independent group of leading technology companies including Zementis. By providing a uniform standard to represent such models, PMML allows for the exchange of predictive solutions between different applications and various vendors.
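Because PMML is plain XML, any XML library can read it. The Python sketch below parses a small hand-written PMML document (not exported from any tool) and evaluates its single regression table. The model name, field names, and coefficients are made up, and the `predict` helper is hypothetical; a real scoring engine handles far more of the standard than this.

```python
import xml.etree.ElementTree as ET

# Hand-written, illustrative PMML 4.0 document describing y = 1.5 + 2.0 * x.
PMML_TEXT = """<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="Illustrative linear regression"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="toy" functionName="regression">
    <MiningSchema>
      <MiningField name="x"/>
      <MiningField name="y" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"pmml": "http://www.dmg.org/PMML-4_0"}

root = ET.fromstring(PMML_TEXT)
table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
intercept = float(table.get("intercept"))
coefficients = {
    p.get("name"): float(p.get("coefficient"))
    for p in table.findall("pmml:NumericPredictor", NS)
}


def predict(record: dict) -> float:
    """Evaluate the regression table: intercept plus coefficient times input."""
    return intercept + sum(coef * record[name] for name, coef in coefficients.items())


print(predict({"x": 3.0}))  # 1.5 + 2.0 * 3.0 = 7.5
```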
Open source statistical tools such as R can be used to develop data mining models based on historical data. R allows for models to be exported into PMML which can then be imported into an operational decision platform and be ready for production use in a matter of minutes.
On-Demand Predictive Analytics
Amazon EC2 is a reliable, on-demand infrastructure on which we offer the ADAPA Predictive Decisioning Engine based on the Software as a Service (SaaS) paradigm. ADAPA imports models expressed in PMML and executes them in batch mode or in real time via web services.
Our service is implemented as a private, dedicated Amazon EC2 instance of ADAPA. Each client has access to his/her own ADAPA instance via HTTP/HTTPS. In this way, models and data for one client never share the same engine with other clients.
Using a SaaS solution to break down traditional barriers that currently slow the adoption of predictive analytics, our strategy translates predictive models into operational assets with minimal deployment costs and leverages the inherent scalability of utility computing.
In summary, ADAPA allows for:
- Cost-effective and reliable service based on Amazon’s EC2 infrastructure
- Secure execution of predictive models through dedicated and controlled instances including HTTPS and Web-Services security
- On-demand computing. Choice of instance type (small, large, extra-large, ...) and launch of multiple instances.
- Superior time-to-market by providing rapid deployment of predictive models and an agile enterprise decision management environment.
Friday, January 28, 2011
The Predictive Model Markup Language (PMML) is the leading standard for representing statistical and data mining models. With PMML, it is straightforward to develop a model on one system using one application and deploy the model on another system using another application. PMML reduces complexity and bridges the gap between development and production deployment of predictive analytics.
PMML is governed by the Data Mining Group (DMG), an independent, vendor-led consortium that develops data mining standards. PMML is currently supported by over 20 vendors and organizations, and awareness as well as use of the standard is growing quickly. To establish a forum in which people can come together to learn and discuss topics related to PMML, we have created a PMML interest group on LinkedIn. The group is going strong, with many PMML-related discussions and announcements every week. And we are happy to announce that the group is now nearing 1,000 members!
To join the Predictive Model Markup Language (PMML) group on LinkedIn, please follow this link:
The group aims to serve as a central resource regarding the practical application of PMML and its benefits for business and IT. PMML increases business agility by eliminating the need for proprietary solutions or custom code development. For this reason, it is a critical element in the quest for business process optimization and automated, intelligent decisions.
We encourage active participation in the PMML group from the entire community, so please post your questions! The group already contains postings related to:
- The value of PMML for business and IT
- PMML powered products
- Links to a general introduction and overview presentation
Thursday, January 20, 2011
Model interfaces: pre- and post-processing
- Text Mining
- Value Mapping
- Regular Expressions
- Logical and Arithmetic Operators
- Lookup Tables
- Custom Functions
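As a rough illustration of what several of these operations do to a raw record, here is a small Python sketch. It does not use PMML syntax or any ADAPA API; the field names, the mapping, and the lookup table are all hypothetical.

```python
import re

# Value mapping: translate raw codes into model-ready categories.
STATE_TO_REGION = {"CA": "west", "NY": "east"}

# Lookup table keyed on two derived inputs.
RATE_TABLE = {("west", "gold"): 0.05, ("east", "gold"): 0.04}


def preprocess(raw: dict) -> dict:
    """Turn a raw input record into derived features for a model."""
    region = STATE_TO_REGION.get(raw["state"], "other")
    # Regular expression: flag free-text comments that mention a refund.
    mentions_refund = bool(
        re.search(r"\brefund(s|ed)?\b", raw["comment"], re.IGNORECASE)
    )
    # Arithmetic and logical operators: a ratio guarded against division by zero.
    utilization = raw["balance"] / raw["limit"] if raw["limit"] > 0 else 0.0
    rate = RATE_TABLE.get((region, raw["tier"]), 0.0)
    return {
        "region": region,
        "mentions_refund": mentions_refund,
        "utilization": utilization,
        "rate": rate,
    }


features = preprocess({
    "state": "CA",
    "comment": "Customer was refunded twice",
    "balance": 250.0,
    "limit": 1000.0,
    "tier": "gold",
})
print(features)
```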
Automatic conversion (and correction) for older versions of PMML
Web-based management and interactive execution of predictive models and business rules
- Model verification: The ADAPA Console includes a model validation test, allowing models to be verified for correctness. By providing ADAPA a test file containing input data and expected results for a model, the engine will report any deviations from expected results, greatly enhancing traceability of errors and debugging of model deployment issues. The console also provides easy access to our rules testing framework in which business rules are submitted to regression testing and acceptance.
- Batch-scoring: The console also provides functionality to upload a (compressed) CSV data file and batch-scores it against any of the deployed models. Results are returned in the same format and may be downloaded for further processing and visualization.
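The two console features above can be pictured with a short Python sketch. It is only a conceptual stand-in: `toy_model`, the column names, and the tolerance are hypothetical, and the code does not call ADAPA. Verification compares each scored row with the expected result and reports deviations; batch scoring returns the input CSV with a score column appended.

```python
import csv
import io


def toy_model(row: dict) -> float:
    """Hypothetical stand-in for a deployed model: y = 1.5 + 2.0 * x."""
    return 1.5 + 2.0 * float(row["x"])


# Test file: an input column plus the expected result for each row.
# The last expected value is deliberately wrong to trigger a deviation.
VERIFICATION_CSV = "x,expected\n0.0,1.5\n1.0,3.5\n2.0,5.0\n"


def verify(model, csv_text, tolerance=1e-6):
    """Score every row and return the rows that deviate from the expected value."""
    deviations = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_no, row in enumerate(reader, start=1):
        actual = model(row)
        expected = float(row["expected"])
        if abs(actual - expected) > tolerance:
            deviations.append({"row": row_no, "expected": expected, "actual": actual})
    return deviations


def batch_score(model, csv_text):
    """Return the input CSV with an extra 'score' column, in the same format."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames + ["score"], lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["score"] = model(row)
        writer.writerow(row)
    return out.getvalue()


deviations = verify(toy_model, VERIFICATION_CSV)
print(deviations)
print(batch_score(toy_model, "x\n0.0\n1.0\n"))
```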
Simplified integration via SOA
Data scoring from inside Excel
On-demand predictive analytics solution
Private instance for all your decisioning needs
Trusted, secure, scalable cloud infrastructure
Wednesday, January 19, 2011
Once a predictive model is exported from a PMML-compliant tool such as SAS EM, SPSS/IBM, R, KNIME, RapidMiner, ... it can be uploaded directly into the Zementis ADAPA engine which makes the model available for execution via its console or as a web-service. ADAPA can already import most of the techniques defined by the PMML standard and now, with the release of ADAPA 3.3, we have expanded it even further to cover Cox Regression and Ruleset models.
Cox Regression Models
The Cox proportional hazards model, a survival analysis technique, is used in various industries, including pharmaceuticals and telecommunications.
Ruleset Models
Ruleset models can be thought of as flattened decision tree models. A ruleset consists of a number of rules in which each rule contains a predicate and a predicted class value.
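A minimal Python sketch of this idea follows. Each rule pairs a predicate with a predicted class, and each rule corresponds to one root-to-leaf path of a small decision tree, which is why the ruleset reads as a flattened tree. The field names and thresholds are made up, and the sketch assumes a "first matching rule wins" selection, which is only one of the ways a ruleset can be evaluated.

```python
from typing import Callable, List, Optional, Tuple

Rule = Tuple[Callable[[dict], bool], str]

# Illustrative rules; each one is a root-to-leaf path of a two-level tree.
RULES: List[Rule] = [
    (lambda r: r["petal_length"] < 2.5, "setosa"),
    (lambda r: r["petal_length"] >= 2.5 and r["petal_width"] < 1.8, "versicolor"),
    (lambda r: r["petal_length"] >= 2.5 and r["petal_width"] >= 1.8, "virginica"),
]


def first_hit(rules: List[Rule], record: dict, default: Optional[str] = None) -> Optional[str]:
    """Return the class of the first rule whose predicate holds, else the default."""
    for predicate, predicted_class in rules:
        if predicate(record):
            return predicted_class
    return default


print(first_hit(RULES, {"petal_length": 1.4, "petal_width": 0.2}))
print(first_hit(RULES, {"petal_length": 5.1, "petal_width": 2.0}))
```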
As of now, ADAPA supports the following modeling techniques:
- Ruleset Models
- Clustering Models (Distribution-Based, Center-Based, and 2-Step Clustering)
- Decision Trees (for classification and regression) together with multiple missing value handling strategies (Default Child, Last Prediction, Null Prediction, Weighted Confidence, Aggregate Nodes)
- Naive Bayes Classifiers
- Neural Networks (Back-Propagation, Radial-Basis Function, and Neural-Gas)
- Regression Models (Linear, Polynomial, and Logistic) and General Regression Models (General Linear, Ordinal Multinomial, Generalized Linear, Cox)
- Support Vector Machines (for regression and multi-class and binary classification)
- Scorecards (point allocation for categorical, continuous, and complex attributes)
Additionally, ADAPA supports a myriad of functions for implementing data pre- and post-processing. These include:
- Value Mapping
- Logical and Arithmetic Operators
and much much more
If you think of anything ADAPA cannot do or something else you need to do in terms of data manipulation, let us know.
For your free trial of ADAPA, please register at: https://myadapa.zementis.com/
Thursday, January 13, 2011
Business Rules are ubiquitous today. They manage the day-to-day operations of thousands of companies worldwide. From stocking to maintenance, rules are an integral part of the way we do business in the 21st century. This kind of knowledge, known as Expert Knowledge, is forged from years of experience, or from what turned out to be the “logical thing to do”.
However, along with the information age, more and more data started being gathered all over the world about the processes and services we as a society came to benefit from. In this sea of data, predictive algorithms were designed to extract its hidden patterns, i.e. knowledge that is hidden from the human eye. This is known as Data-Driven Knowledge.
In an ideal world, business rules and predictive models live side by side benefiting from each other since both encode complementary types of knowledge.
In the presentation below, originally given at RulesFest 2010, Dr. Alex Guazzelli starts by differentiating the two types of knowledge. He then makes the point that companies can get Enhanced Decisioning whenever expert and data-driven knowledge are combined. Dr. Guazzelli goes on to describe the making of a predictive solution by using a "fish processing plant" as an analogy for any process that can benefit from intelligent decisioning. He ends by showing how such a solution can be deployed using PMML (the Predictive Model Markup Language) and easily moved to the production environment using ADAPA, the Zementis Predictive Decisioning Engine.