Tuesday, July 14, 2009

Data Mining for MySQL: Scoring your MySQL data just became a lot easier!

Many databases already support data mining and analysis. SQL Server, for example, benefits from SQL Server Integration Services (SSIS) and Oracle from Oracle Data Miner. MySQL users, on the other hand, have generally relied on tools such as R and SPSS to mine data and build statistical models. There is even an R package, RMySQL, that provides an interface between R and MySQL. Both R and SPSS (as well as a host of other statistical tools) can export PMML (Predictive Model Markup Language), the standard way to represent data mining models (for more on PMML, click here).

We have recently shown that one can easily deploy predictive models from SQL Server on the Amazon Cloud in a matter of minutes by using a script task in SSIS and the ADAPA Scoring Engine (see SSIS/ADAPA posting here). This time, we would like to make a similar case for MySQL.

Keep in mind that building a model is a very different task from deploying or executing one. Model development consists mostly of data analysis and massaging, as well as feature selection. During model execution, all you need to generate your decisions are the most important data fields, a much smaller set than what you used during model development. In addition, the required pre-processing can be represented in PMML (for more on pre-processing and PMML, click here).
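To make the pre-processing point concrete, here is a minimal sketch in Python of how a normalization step expressed in PMML could be read and applied to a single record. The fragment uses the PMML elements DerivedField, NormContinuous and LinearNorm; the field names (age, age_norm), the anchor values and the helper apply_norm_continuous are assumptions made up for illustration. A real PMML file also declares an XML namespace, which is left out here for brevity.

```python
import xml.etree.ElementTree as ET

# Illustrative PMML fragment (field names and values are assumptions):
# maps age 18 -> 0 and age 98 -> 1 by linear interpolation.
PMML_FRAGMENT = """
<DerivedField name="age_norm" optype="continuous" dataType="double">
  <NormContinuous field="age">
    <LinearNorm orig="18" norm="0"/>
    <LinearNorm orig="98" norm="1"/>
  </NormContinuous>
</DerivedField>
"""


def apply_norm_continuous(fragment, record):
    """Apply a NormContinuous transformation to one input record.

    Hypothetical helper: handles only piecewise-linear normalization with
    at least two LinearNorm points, and extrapolates outside the range.
    """
    derived = ET.fromstring(fragment)
    norm = derived.find("NormContinuous")
    points = sorted(
        (float(p.get("orig")), float(p.get("norm")))
        for p in norm.findall("LinearNorm")
    )
    x = float(record[norm.get("field")])
    # Pick the segment that contains x; fall through to the last segment
    # when x lies beyond the final anchor point.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            break
    value = y0 + (x - x0) * (y1 - y0) / (x1 - x0)
    return derived.get("name"), value


if __name__ == "__main__":
    print(apply_norm_continuous(PMML_FRAGMENT, {"age": 58}))
```

Because the transformation travels with the model file, the database only has to supply the raw field, and the scoring side can derive the normalized value itself.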

Model Deployment: Once a model exists, it can easily be uploaded to ADAPA, which makes it available right away for execution via Web Services.

Model Execution: The task then is to extract data from your MySQL database, score it, and write the scored data back into the database. You can easily do that with yet another open source tool: Jitterbit. It maps data from MySQL into a Web Service call to ADAPA, which returns the scored data to Jitterbit for writing back into MySQL.
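The extract, score and write-back loop can also be sketched in a few lines of Python to show what the integration tool is doing on your behalf. This is only an illustration under stated assumptions, not how Jitterbit or ADAPA work internally: the customers table, its columns, and the fake_score function (standing in for the Web Service call) are all hypothetical. So that the example runs without a database server, it uses Python's built-in sqlite3 module as a stand-in; with a MySQL driver you would pass a MySQL connection and keep the default "%s" placeholder.

```python
import sqlite3  # stand-in for a MySQL driver so the sketch is self-contained


def score_rows(conn, score_fn, placeholder="%s"):
    """Read unscored rows, score each one, and write the score back.

    conn is any DB-API connection; the table and column names are
    assumptions made for this example.
    """
    cur = conn.cursor()
    cur.execute("SELECT id, age, income FROM customers WHERE score IS NULL")
    rows = cur.fetchall()
    for row_id, age, income in rows:
        score = score_fn({"age": age, "income": income})
        cur.execute(
            "UPDATE customers SET score = " + placeholder
            + " WHERE id = " + placeholder,
            (score, row_id),
        )
    conn.commit()
    return len(rows)


def fake_score(record):
    """Hypothetical stand-in for the Web Service call to the scoring engine."""
    return round(0.01 * record["age"] + 0.00001 * record["income"], 4)


def demo():
    """Build a tiny in-memory table, score it, and return the results."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers "
        "(id INTEGER PRIMARY KEY, age REAL, income REAL, score REAL)"
    )
    conn.executemany(
        "INSERT INTO customers (id, age, income) VALUES (?, ?, ?)",
        [(1, 30, 50000), (2, 45, 82000)],
    )
    scored = score_rows(conn, fake_score, placeholder="?")
    result = conn.execute("SELECT id, score FROM customers ORDER BY id").fetchall()
    return scored, result


if __name__ == "__main__":
    print(demo())
```

Selecting only rows whose score is still NULL means the loop can be run repeatedly and will only pick up new, unscored records.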

Process in Detail - Blog: We have described this process on a step-by-step basis here.

Process in Detail - Video: We have also made a video describing this process. The YouTube version of this video can be accessed below, but we highly recommend the high-definition version of it.



Copyright © 2009-2014 Zementis Incorporated. All rights reserved.
