Thursday, December 23, 2010
Wednesday, December 15, 2010
AWS Achieves PCI DSS Level 1 Compliance and ISO 27001 Certification
Amazon Web Services (AWS) recently achieved Level 1 PCI compliance and ISO 27001 certification for its infrastructure, data centers, and services. In practice, AWS is now a validated PCI Service Provider, which means that merchants and other service providers can become Payment Card Industry (PCI) certified while storing, processing, and transmitting credit card information using AWS.
For more information on PCI compliance as well as ISO 27001 certification, please read the Amazon Newsletter.
Benefiting from Predictive Analytics on the Cloud
ADAPA (Adaptive Decision And Predictive Analytics) is a predictive decisioning platform. It combines the power of predictive analytics and business rules to facilitate the tasks of designing and managing automated decision systems.
As a scoring engine, ADAPA supports the PMML standard. In this way, different data mining models can be uploaded into the engine and executed in real-time or batch mode.
ADAPA on the Amazon Cloud
ADAPA is the first standards-based, real-time scoring engine available on the market and the first scoring engine accessible on the Amazon Cloud as a service. ADAPA on the Cloud combines the benefits of Software as a Service (SaaS), the scalability of cloud computing and the extensive feature set of ADAPA on Site.
By utilizing the ADAPA Control Center, you can launch and terminate a new ADAPA instance on the Amazon Cloud in minutes, which allows you to quickly scale capacity, both up and down, as your computing requirements change.
Finally, offering ADAPA on the Amazon Cloud as a service changes the economics of predictive analytics by allowing you to pay only for the computing that you actually use. What a concept ... huh?
Friday, December 3, 2010
Friday, November 19, 2010
With this new release, we are making ADAPA compliant with PMML 4.0, the latest version of the Predictive Model Markup Language.
PMML is the de facto standard for representing data mining models. It allows predictive models to be built in any of the top model building environments (commercial and open-source) and easily deployed in ADAPA, the Zementis decisioning engine platform.
Besides offering compliance with PMML 4.0, ADAPA is also able to automatically convert old PMML files (versions 2.0, 2.1, 3.0, 3.1, and 3.2) to 4.0. That's because it incorporates a PMML conversion process, which is able to validate, correct and convert files to PMML 4.0 automatically.
When models are deployed in ADAPA, they are made available right away for execution in batch mode or in real time. Models can be accessed from anywhere in the enterprise via web services, the ADAPA web console, or Excel.
Compliance with PMML 4.0 allows ADAPA to officially support the following 4.0 features:
- Multi-class classification for Support Vector Machines (SVMs) for either one-against-one or one-against-all.
- Extended built-in functions for pre- and post-processing of input data:
- Functions for boolean operations: isMissing, isNotMissing, equal, notEqual, lessThan, lessOrEqual, greaterThan, greaterOrEqual, and, or, not, isIn, isNotIn
- IF-THEN-ELSE construct
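As an informal illustration of how a few of these built-in functions behave, the Python sketch below mimics isMissing, isIn, and the IF-THEN-ELSE construct. The helper names and the "risk band" example are hypothetical stand-ins invented for this sketch, not ADAPA or PMML APIs, and a missing value is assumed to be modeled as None.

```python
# Hypothetical Python stand-ins for a few PMML 4.0 built-in functions.
# Assumption of this sketch: a missing value is represented by None.

def is_missing(x):
    return x is None

def is_in(x, *candidates):
    return x in candidates

def if_then_else(condition, then_value, else_value):
    return then_value if condition else else_value

def risk_band(region):
    # Nested IF-THEN-ELSE, similar to how "if" calls can be nested in PMML.
    return if_then_else(
        is_missing(region), "unknown",
        if_then_else(is_in(region, "north", "south"), "high", "low"),
    )
```

Calling risk_band with a missing region takes the first branch, while any other value falls through to the isIn test.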
Zementis is constantly working to make ADAPA even better. This new release is a testament to that.
Friday, September 10, 2010
Would you like to take PMML for a spin? Do you have PMML files you would like to test-drive?
ADAPA on the Amazon Cloud, our PMML-based scoring engine, is now available as a free trial. Just register, log in and have fun! Go to:
Feel free to export PMML from your favorite tool and give it a try. ADAPA checks if PMML files are syntactically and semantically sound and if not, it corrects them automatically (if possible). For more info, click HERE.
The free trial also gives you access to the Excel add-in which allows for data scoring directly from within Excel (2007 and 2010).
ADAPA free trial, the easiest way to try PMML!
Friday, September 3, 2010
The PMML Converter is now part of ADAPA!
With the release of ADAPA 3.1, we were able to make ADAPA even better when it comes to converting and correcting PMML files. That's because ADAPA now incorporates the PMML Converter which allows it to consume older versions of PMML (2.0 onwards) and convert them internally to the latest PMML version. This is yet another big step towards true interoperability.
When converting a PMML file, ADAPA validates the file against the PMML specification at two levels:
1) Syntactically: this phase works similarly to a spell checker. Its goal is to make sure that the PMML file conforms to the XSD schema. It answers the question: Is the PMML file grammatically sound?
2) Semantically: given that the file is grammatically correct, does it make sense? Is it trying to sum up strings? If so, it may not be a valid model after all. This is a very important phase and crucial for a model to produce scores and probabilities that make sense.
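The two levels can be pictured with a small Python sketch. It is not how ADAPA is implemented: the syntactic step here only checks XML well-formedness (a real check would validate against the PMML XSD), the semantic step is one example rule chosen for illustration (every MiningField must be declared in the DataDictionary), and the sample document omits the PMML namespace to keep the lookup simple.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily abridged PMML document; "var9" is deliberately undeclared.
PMML_TEXT = """<PMML version="4.0">
  <DataDictionary>
    <DataField name="var1" optype="categorical" dataType="string"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="var1"/>
      <MiningField name="var9"/>
    </MiningSchema>
  </RegressionModel>
</PMML>"""

def check_pmml(text):
    """Return a list of problems found; an empty list means both checks passed."""
    # Level 1 (syntactic): only XML well-formedness in this sketch.
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        return [f"syntax: {err}"]
    # Level 2 (semantic): every MiningField must be declared in the DataDictionary.
    declared = {field.get("name") for field in root.iter("DataField")}
    return [
        f"semantic: field '{m.get('name')}' is not in the DataDictionary"
        for m in root.iter("MiningField")
        if m.get("name") not in declared
    ]
```

The sample document is well-formed but refers to an undeclared field, so it passes the first level and fails the second.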
In many cases, even if the model has syntactic or semantic problems, ADAPA automatically corrects known issues for models exported from several model development environments. For that, we peered through thousands of PMML files submitted to us by our partners and clients.
ADAPA is also able to verify that a model has been uploaded correctly. This can be done either via the PMML file itself with the use of the Model Verification element, or via the ADAPA Web Console. In this case, all you need to do is upload a data file with expected results and ADAPA will automatically perform the scoring match test. For more on model verification, please click HERE.
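Conceptually, a scoring match test compares the scores a deployed model produces with the scores expected from the modeling environment. The sketch below assumes a hypothetical toy model and a hypothetical tolerance; it only illustrates the idea and does not reflect ADAPA's internals.

```python
def score_match(model, records, expected, tolerance=1e-6):
    """Return the indices of records whose score differs from the expected one."""
    mismatches = []
    for i, (record, want) in enumerate(zip(records, expected)):
        got = model(record)
        if abs(got - want) > tolerance:
            mismatches.append(i)
    return mismatches

def toy_model(record):
    # Hypothetical stand-in for a deployed model.
    return 2.0 * record["x"] + 1.0

records = [{"x": 1.0}, {"x": 2.0}, {"x": 3.0}]
expected = [3.0, 5.0, 7.5]  # the last expected value is deliberately wrong
```

An empty result means every record matched; here the third record is reported because the toy model returns 7.0 for it.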
The aim of release 3.1 is to make it extremely easy for users to upload PMML files into ADAPA, a major step towards true interoperability.
If for any reason, your PMML code cannot be converted or corrected automatically, feel free to contact us. We are here to help!
Monday, August 2, 2010
Predictive maintenance solutions are based on the idea that one is able to know that a machine or piece of equipment is going to fail, and take proactive actions to ensure process reliability and safety. By using data from sensors that capture vibration information from rotating equipment, my team built a predictive maintenance solution that alerted personnel to imminent breakdowns. For that, we used a combination of statistical tools. For example, we used R, an open-source statistical package for data analysis, IBM SPSS Statistics for analysis and model building, and the Zementis ADAPA platform for model deployment. Since all these systems support PMML, the Predictive Model Markup Language, instead of spending time translating code from one system to another, we were able to concentrate on the problem itself and use the tools we trusted the most to get the job done.
PMML is the de facto standard used to represent predictive analytic or data mining models. With PMML, a predictive solution may be built in one system and deployed in another where it can be put to work immediately. The adoption of PMML by the major analytic vendors is a testimony to their commitment to interoperability and the advancement of predictive analytics as a critical factor to the betterment of society. PMML is developed by the Data Mining Group (DMG), a committee composed not only of commercial and open-source analytic companies including IBM, SAS, Zementis, Microstrategy, KNIME and Rapid-I, but also of analytic users such as NASA, Visa, and Equifax.
In the wake of the gulf tragedy, predictive analytics and open standards can provide yet another tool for safeguarding operations and ensuring safety and process reliability. While predictive analytics can offer solutions that alert us to problems before they actually happen, open standards such as PMML are key ingredients for ensuring that the building and deployment of predictive maintenance solutions is application independent and, as a result, agile and transparent.
We recently wrote a series of two articles for the IBM developerWorks website that covers PMML and predictive maintenance. To read both articles in their entirety, please refer to the following links:
1)What is PMML? Explore the power of predictive analytics and open standards
2)Representing predictive solutions in PMML: Move from raw data to predictions
Tuesday, June 29, 2010
Friday, June 4, 2010
It is obvious that open source tools for predictive analytics are gaining more and more momentum. Just last month, the KDNuggets.com website ran a poll which asked visitors to vote for the tools they had used for a real project in the last 12 months. The result ... more than 50% of the respondents said they used open-source tools such as R, KNIME, and RapidMiner. Now, it may be that the responses are really not that representative of the entire data mining community, but they do reflect a trend: open source data mining tools are here to stay and their use is growing as they become better and easier to use. And, guess what? Their support for PMML is stronger than ever.
Rapid-I has just released an extension for RapidMiner offering the export of PMML for several modeling techniques. KNIME continues to expand its PMML support with new capabilities ... and Weka has just announced support for Support Vector Machines (in addition to several other PMML elements). Augustus from Open Data supports segmented models expressed in PMML. The Zementis PMML Converter tool, which unifies the different versions of PMML into a single version is now also a corrector and supports PMML 2.0 through 4.1.
There is also news from commercial vendors: Pervasive has announced support for PMML in DataRush. The Zementis ADAPA decisioning platform, which is available as a service on the Amazon Cloud, now offers the seamless integration of models expressed in PMML and business rules.
Last but not least, the Data Mining Group (DMG), which is responsible for developing PMML, is constantly expanding. NASA has recently joined the DMG, so chances are PMML will grow out of this world and into the stars. According to the DMG website (dmg.org):
"The NASA Data Mining and Trending Working Group (DMTWG) was established to strengthen data/text mining and trending within and across NASA datasets, to aid in the identification and mitigation of adverse trends and to raise the awareness and capability of data/text mining within the agency. "
From down under, Togaware has also joined the DMG. Togaware's best-known product is Rattle, an open-source data mining tool built on top of R that produces and consumes PMML. Togaware maintains the PMML package for R, which can be obtained from CRAN, the free "app store" for R users.
Wednesday, May 26, 2010
PMML (Predictive Model Markup Language) is the de facto standard used to represent and share predictive analytic solutions between applications. This enables data mining scientists and users alike to easily build, visualize, and deploy their solutions using different platforms and systems. This book presents PMML from a practical perspective. It contains a variety of code snippets so that concepts are made clear through the use of examples.
PMML in Action is a great way to learn how to represent your predictive models through a mature open standard. The book is divided into six parts, taking you on a PMML journey in which language elements and attributes are used to represent not only modeling techniques but also data transformations.
With PMML, users benefit from a single and concise standard to represent data and models, thus avoiding the need for custom code and proprietary solutions.
You too can join the PMML movement! Unleash the power of predictive analytics and data mining today! Available now on Amazon.com.
"The very first book that covers the industry standard for transferring and integrating predictive models across systems, this is a milestone for predictive analytics. If you want the long and short on engineering for versatility in how predictive models can be deployed and put to work, get started by curling up with this book."
Eric Siegel, Ph.D.
President, Prediction Impact, Inc.
Conference Chair, Predictive Analytics World
"Open standards facilitate innovation and progress (web is a great example). PMML (the Predictive Model Markup Language) is an open standard for predictive analytics and data mining, developed over more than 12 years and supported by most industry leaders. This easy to read book covers data transformations, many modeling methods (Associations, Clustering, Decision Trees, Neural Nets, Regression, SVM, and more), model ensembles, and verification. This book is your essential guide to PMML !"
Gregory Piatetsky, Ph.D.
Editor KDNuggets, Founder KDD/SIGKDD
"Next generation enterprise are going to be driven by analytics, especially predictive analytics. Sharing and rapidly deploying predictive analytic models is essential and PMML is the open standard that delivers the interoperability and agility that these predictive enterprises need."
CEO, Decision Management Solutions
Co-author of “Smart (Enough) Systems: How to Deliver Competitive Advantage by Automating Hidden Decisions ”
“PMML in Action may be destined to become an analog to the famous Kernighan and Richie book, "The C Programming Language", published in 1978. This book (affectionately known as K&R) became the standard guide for ANSII C programming practice. I expect that "PMML in Action" will function likewise in the burgeoning development of PMML in analytical tools now, and in the future. It is the "cookbook" for PMML programming. Julia Child made French cuisine kiss-simple for housewives to create. Now, programmers can follow the descriptions and practices in this book to implement analytical solutions in PMML as easily and efficiently as Julia enabled a housewife to make a French soufflé."
Robert A. Nisbet, Ph.D.
Co-author of “Handbook of Statistical Analysis & Data Mining Applications”
Thursday, May 20, 2010
ADAPA 3.0 on the Amazon Cloud is also available in three new high-memory instance types. In addition, all types (old and new) can now be launched in the US, EU, and Asia-Pacific regions.
All the new and exciting features in this 3.0 release are described below.
Predictive Analytics + Business Rules on the Cloud
With ADAPA 3.0, the integration of predictive analytics and rules is seamless. Simply put, ADAPA allows both data-driven and expert knowledge to be combined into a single and concise solution, executed in real time or in batch mode. This feature was previously available for on-site deployments of ADAPA, but now it is also available for ADAPA on the Cloud!
Customized PMML Extensions on the Cloud
You can now easily create and upload to ADAPA your own custom predictive model components. Are you using a transformation not supported in PMML? Not a problem. Just write it in Java, upload it to ADAPA, and use it in your models just as you would use a standard PMML transformation.
- Model Verification: ADAPA now supports the PMML element for Model Verification. This feature adds yet another important PMML element to the already extensive list of elements supported by ADAPA. Model verification via a CSV file continues to be supported (click here for more).
- Data Types: Enhanced support for models with data type incompatibilities and value down-casting.
- PMML Conversion: Several PMML converter and corrector improvements, including support for TIBCO models. ADAPA 3.0 and its converter are also tested for compatibility with models from SAS, IBM SPSS, KXEN, KNIME, RapidMiner, R and Rattle, Microsoft SSIS, Statistica, Microstrategy, Pervasive DataRush, and more.
- Performance: A number of performance enhancements. ADAPA now is faster than ever!
Web Console Improvements
The web console of ADAPA 3.0 includes several functionality and usability improvements, such as:
- Ability to upload custom resources: These include non-standard PMML transformations, mapping or look-up tables, and custom data models used in business rules.
- Dependencies tracking: You are now able to see the dynamic tracking and update of dependencies across models, rules, and custom resources.
- Enhanced upload capabilities: You can now upload multiple files at once, watch the progress of the upload, and even stop the upload if necessary.
- Similarly enhanced scoring capabilities: You can now watch the progress of, or cancel, the uploading and processing of a large data file.
Control Center Improvements
For users of ADAPA on the Amazon Cloud, the 3.0 release is being made available through the control center in three new "High-Memory" instance types: "Extra Large", "Double Extra-Large", and "Quadruple Extra Large". For pricing, please refer to the ADAPA on the Cloud pricing page.
With this release, all existing and new "High-Memory" instance types are available for launching in the Asia-Pacific (Singapore) region, in addition to US and EU regions.
Besides all the new features and improvements listed above, ADAPA 3.0 also comes with a complete set of sample files, which include predictive models, rule sets, resource files, reports, and test files, so that you can get up and running with your decisioning solution right away. The sample files can be downloaded from the help page. On the same page you will also find links to all the ADAPA documentation, including the "ADAPA Solutions Guide", which describes how the sample files come together into a single and powerful predictive decisioning solution that you can make your own.
Try ADAPA 3.0 today!
If you are already a subscriber, launch your instances. If not, it is easy to subscribe to ADAPA on Amazon. Sign up now!
Wednesday, May 19, 2010
Monday, May 17, 2010
It was just a matter of time!
You can now execute predictive models directly from your iPhone. Yes, very convenient given that now predictions are made available to you no matter where you are.
This is what the new iPhone web app developed by Dymatrix offers to DynaMine customers: A mobile real-time next best offer system powered by ADAPA. With this web app, DynaMine users can leverage the power of predictive models deployed on ADAPA instantly.
The web app is the fruit of a recent partnership between Zementis and Dymatrix to jointly deliver operational predictive analytics to a variety of companies. Based in Stuttgart, Germany, Dymatrix is a premier consulting firm with broad expertise in the implementation of business analytics technologies, processes and solutions, such as automated adaptive model training and scoring. Its DynaMine framework offers automated adaptive model training and model management capabilities with a certified interface to the Zementis ADAPA decision engine for instant deployment and high-performance scoring.
The ability to efficiently move models between platforms is a testament to the success of PMML (Predictive Model Markup Language) as the conduit in which true interoperability becomes a reality ... and yes, that now includes your iPhone.
Thursday, March 11, 2010
Zementis, Inc. today announced a strategic partnership with the Dymatrix Consulting Group to jointly deliver operational predictive analytics to a variety of companies.
Based in Stuttgart, Germany, Dymatrix is a premier consulting firm with broad expertise in the implementation of business analytics technologies, processes and solutions, such as automated adaptive model training and scoring.
For the full press release, click HERE.
Dymatrix DynaMine® framework offers automated adaptive model training and model management capabilities with a certified interface to the Zementis ADAPA decision engine for instant deployment and high-performance scoring.
The ability to efficiently move models between platforms is a testament to the success of PMML (Predictive Model Markup Language) as the conduit in which true interoperability becomes a reality.
The partnership between Zementis and Dymatrix Consulting Group offers yet another way for companies and individuals around the world to use ADAPA as their deployment platform for predictive solutions.
Monday, February 22, 2010
Scorecards are extremely popular, since they provide a clear and effective way to predict outcomes for a variety of situations. By clear I mean that the logic behind the scores obtained via a scorecard can be easily understood and appreciated. Scorecards are effective for situations in which you want to predict the probability of someone or something being "bad" or "good". These probabilities can then be readily used for decision making.
Scorecards, like any data mining model, contain a set of input fields which are used to predict a certain target value. This prediction can be seen as an assessment of a prospect, a customer, or a scenario for which an outcome is predicted based on historical data. In a scorecard, input fields, also referred to as characteristics (for example, "Age"), are broken down into attributes (for example, "20-29" and "30-39" age groups) with specific partial scores associated with them. These scores represent the influence of the input attributes on the target and are readily available for inspection. For example, a high partial score for a particular attribute could imply a heavy dependence of the target value on that attribute. Partial scores are then summed up so that an overall score can be obtained for the target value (is it good? Or, is it bad?).
ADAPA provides two different ways for scorecards to be represented: the first is through rules, as described in the ADAPA Scorecard Guide, and the second, described here, is through the use of PMML.
Given that PMML does not offer a specific scorecard element, we use a RegressionModel element to implement different score allocation strategies and to compute the overall score. More specifically, we show here how to represent different attributes (categorical or continuous ... and complex) and their corresponding partial scores by the use of data transformations and built-in functions (see tutorial on data processing in PMML).
Score Allocation for Categorical Attributes
Typical score allocation for categorical attributes is done by associating a partial score with each attribute. In the PMML code shown below, input field "var1" may contain one of the following values (or attributes): "positive", "negative", and "neutral", for each of which a partial score is defined (see table below for score allocation details). Note that it also accounts for missing values. In the PMML example, the resulting partial score is assigned to derived variable "derivedVar1".
Note that for categorical attributes, we simply use the MapValues element to implement score allocation. If the input field consists of a large set of attributes, score allocation can be easily implemented by using the element TableLocator.
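A minimal sketch of this kind of allocation follows. The PMML fragment is embedded as a string and evaluated by a small Python helper written only for this illustration. The partial scores (12, -5, 3), the score for missing values (-1), and the default (0) are assumed values, not the ones from the original table, and the helper is not part of ADAPA.

```python
import xml.etree.ElementTree as ET

# Hypothetical MapValues transformation; all scores are assumed values.
MAP_VALUES = """<DerivedField name="derivedVar1" dataType="double" optype="continuous">
  <MapValues mapMissingTo="-1" defaultValue="0" outputColumn="score">
    <FieldColumnPair field="var1" column="attribute"/>
    <InlineTable>
      <row><attribute>positive</attribute><score>12</score></row>
      <row><attribute>negative</attribute><score>-5</score></row>
      <row><attribute>neutral</attribute><score>3</score></row>
    </InlineTable>
  </MapValues>
</DerivedField>"""

def apply_map_values(xml_text, record):
    """Look up the partial score for the input field named in FieldColumnPair."""
    mv = ET.fromstring(xml_text).find("MapValues")
    pair = mv.find("FieldColumnPair")
    value = record.get(pair.get("field"))
    if value is None:  # missing input
        return float(mv.get("mapMissingTo"))
    for row in mv.iter("row"):
        if row.findtext(pair.get("column")) == value:
            return float(row.findtext(mv.get("outputColumn")))
    return float(mv.get("defaultValue"))  # value not listed in the table
```

For example, a record with var1 set to "positive" receives 12.0, while a record without var1 receives the mapMissingTo score.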
Score Allocation for Continuous Attributes
In the PMML code shown below, continuous input field "var2" has been discretized into three ranges or attributes: "less than 100", "greater or equal to 100 and less than 200", and "greater than 200" (see table for score allocation details). Note that it also accounts for missing values. In the PMML example, the resulting partial score is assigned to derived variable "derivedVar2a".
Note that for continuous attributes, we simply use the Discretize element to implement score allocation.
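The following sketch shows one way such a Discretize transformation could look, together with a small Python helper that evaluates it. The bin scores (1, 7, 12), the missing-value score (-1), and the default (0) are assumed for illustration. The intervals follow the ranges as stated in the text, so a value of exactly 200 matches no bin and falls back to the default.

```python
import xml.etree.ElementTree as ET

# Hypothetical Discretize transformation; all scores are assumed values.
DISCRETIZE = """<DerivedField name="derivedVar2a" dataType="double" optype="continuous">
  <Discretize field="var2" mapMissingTo="-1" defaultValue="0">
    <DiscretizeBin binValue="1">
      <Interval closure="openOpen" rightMargin="100"/>
    </DiscretizeBin>
    <DiscretizeBin binValue="7">
      <Interval closure="closedOpen" leftMargin="100" rightMargin="200"/>
    </DiscretizeBin>
    <DiscretizeBin binValue="12">
      <Interval closure="openOpen" leftMargin="200"/>
    </DiscretizeBin>
  </Discretize>
</DerivedField>"""

def in_interval(x, interval):
    """Check x against an Interval, honoring open and closed margins."""
    closure = interval.get("closure")
    left, right = interval.get("leftMargin"), interval.get("rightMargin")
    if left is not None:
        lo = float(left)
        if x < lo or (x == lo and closure.startswith("open")):
            return False
    if right is not None:
        hi = float(right)
        if x > hi or (x == hi and closure.endswith("Open")):
            return False
    return True

def apply_discretize(xml_text, record):
    disc = ET.fromstring(xml_text).find("Discretize")
    value = record.get(disc.get("field"))
    if value is None:  # missing input
        return float(disc.get("mapMissingTo"))
    for bin_ in disc.iter("DiscretizeBin"):
        if in_interval(value, bin_.find("Interval")):
            return float(bin_.get("binValue"))
    return float(disc.get("defaultValue"))  # no bin matched
```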
Score Allocation for Complex Attributes
If the attributes are complex, built-in functions can be used to implement score allocation. The PMML code shown below uses several built-in functions to implement a complex score allocation (see table for details). As in the previous score allocation examples, this also accounts for missing values. In the PMML example, the resulting partial score is assigned to derived variable "derivedVar2b".
Note that we are using built-in function IF-THEN-ELSE in conjunction with arithmetic operators to implement the necessary logic. Built-in functions in PMML are very powerful and can be used to represent a variety of complex score allocation strategies.
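To make the idea concrete, here is a hedged sketch of a nested "if" expression and a tiny Python evaluator for it. The allocation rule is an assumption invented for this example (0 when "var2" is missing, half of "var2" when it is below 100, and 10 otherwise); it is not the rule from the original table, and the evaluator supports only the four functions used here.

```python
import xml.etree.ElementTree as ET

# Hypothetical complex score allocation built from PMML built-in functions.
COMPLEX = """<DerivedField name="derivedVar2b" dataType="double" optype="continuous">
  <Apply function="if">
    <Apply function="isMissing"><FieldRef field="var2"/></Apply>
    <Constant>0</Constant>
    <Apply function="if">
      <Apply function="lessThan"><FieldRef field="var2"/><Constant>100</Constant></Apply>
      <Apply function="*"><FieldRef field="var2"/><Constant>0.5</Constant></Apply>
      <Constant>10</Constant>
    </Apply>
  </Apply>
</DerivedField>"""

def evaluate(node, record):
    """Recursively evaluate Constant, FieldRef, and Apply nodes."""
    if node.tag == "Constant":
        return float(node.text)
    if node.tag == "FieldRef":
        return record.get(node.get("field"))  # None models a missing value
    function, args = node.get("function"), list(node)
    if function == "if":  # evaluate only the branch that is selected
        branch = args[1] if evaluate(args[0], record) else args[2]
        return evaluate(branch, record)
    values = [evaluate(arg, record) for arg in args]
    if function == "isMissing":
        return values[0] is None
    if function == "lessThan":
        return values[0] < values[1]
    if function == "*":
        return values[0] * values[1]
    raise ValueError(f"unsupported function: {function}")

def derived_var2b(record):
    # The Apply element is the first child of the DerivedField.
    return evaluate(ET.fromstring(COMPLEX)[0], record)
```

Evaluating only the selected branch means the comparison is never attempted on a missing value.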
Computing the Overall Score
The score allocation examples shown here include input attributes which are either related to "var1", which is a categorical field, or to "var2", which is continuous. For each attribute associated with these fields, a partial score is assigned to one of the derived fields "derivedVar1", "derivedVar2a", and "derivedVar2b" by using a PMML transformation.
Finally, as shown in the PMML code below, the sum of all partial scores is implemented via a regression table for which all regression coefficients are set to 1. Note also that score allocation for all attributes is represented as transformations placed inside the LocalTransformations element.
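The summation step can be sketched as follows. The RegressionTable fragment uses the derived field names from the text with an assumed intercept of 0, and the Python helper and the partial scores passed to it are hypothetical.

```python
import xml.etree.ElementTree as ET

# Regression table with all coefficients set to 1, so the result is a plain sum.
REGRESSION_TABLE = """<RegressionTable intercept="0">
  <NumericPredictor name="derivedVar1" coefficient="1"/>
  <NumericPredictor name="derivedVar2a" coefficient="1"/>
  <NumericPredictor name="derivedVar2b" coefficient="1"/>
</RegressionTable>"""

def overall_score(xml_text, derived):
    """Compute intercept + sum(coefficient * partial score)."""
    table = ET.fromstring(xml_text)
    score = float(table.get("intercept"))
    for predictor in table.iter("NumericPredictor"):
        score += float(predictor.get("coefficient")) * derived[predictor.get("name")]
    return score

# Hypothetical partial scores produced by the three transformations.
partial_scores = {"derivedVar1": 12.0, "derivedVar2a": 7.0, "derivedVar2b": 25.0}
```

With these assumed partial scores the overall score is simply their sum, 44.0.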
A file containing the full PMML example shown here as well as data for model verification can be found in the PMML Examples page of the Zementis website.
There is a whole lot of information posted on different websites about Scorecards, PMML and ADAPA. If you want to learn more about how to represent data processing in PMML, including different ways to perform score allocation for complex attributes, make sure to check our PMML Data Processing Primer.
For a more detailed list of ADAPA features, feel free to take a tour of ADAPA on the Cloud or check what is inside the ADAPA box. If you are still unsure about any of the features or would like to learn more about them and how ADAPA can represent scorecards using rules, drop us a note or give us a call. You can find our contact information in the contacts page of the Zementis website.
Tuesday, February 16, 2010
Thursday, February 4, 2010
Zementis is constantly adding new features to ADAPA. In its latest 2.20 release (February 2, 2010), it adds important new features with automatic PMML conversion, model composition, and an improved Web Console experience.
Integrated PMML Converter (and Corrector)
With this release, the popular PMML Converter has been incorporated seamlessly into ADAPA. As a result, ADAPA can now directly import older PMML versions. You may be already aware that many of the modeling tools still export older versions of PMML. Now, you can directly import these older versions of PMML into ADAPA without having to manually upgrade them externally. In addition, we have added functionality to automatically correct a number of common problems found in PMML generated by some popular modeling tools, allowing the models to work as intended.
Remember that we recommend that you always run a score matching test to validate the imported models against your data and modeling environment. And if you find that we still missed something in our PMML converter/corrector, do let us know.
ADAPA now supports composing of multiple models into a single model. This important feature supports a variety of model composition cases such as model selection or segmentation, model sequencing, and value post-processing. For examples and instructions on how to represent model composition in PMML and ADAPA, please refer to the ADAPA Predictive Analytics Guide available for download from the ADAPA Console Help page.
The ADAPA Web Console now allows you to download any of the imported models. This feature makes it easy to review your models, including any warning messages generated during the import. In addition, given that all imported models are automatically converted, the download feature allows you to retrieve and review the upgraded (and possibly corrected) version of your model.
For more information on this exciting new feature, please feel free to contact us.
Thursday, January 21, 2010
Granted, we as humans are moving more and more to cities (think of China), and so, as discussed in the article, by 2050 70% of us will be living in cities. This brings several thoughts to mind ... but what IBM is interested in is data and advanced analytics "to make sense of it all". Data is envisioned to come more and more from sensors in our roadways, rivers (water flow and pollution monitoring), bridges, and buildings, as well as from other sources cited in the article such as health-care and education.
But how can we make sense of it all? As highlighted in the Newsweek article, by bringing standards to the table. If we together as a nation and as humans are to benefit from the intelligence inherent to all this data, agreeing on standards is a must. Think of the Xmas day plot that turned into a data fiasco ... lots of data in different formats and places, but no intelligence or connectivity.
Standards such as PMML (Predictive Model Markup Language), which allows predictive analytics models as well as data pre- and post-processing to be moved between analytical tools, are not only a great example of companies getting together to move in the right direction, but also something to be celebrated. IBM has been part of the DMG (Data Mining Group), the committee shaping PMML, for many years now. Other companies such as SAS, Microstrategy, Equifax, and Zementis are also part of the committee, as are open-source companies such as KNIME and Rapid-I.
IBM is betting on advanced analytics and standards ... and so are we!