Showing posts with label Input File. Show all posts

Wednesday, March 19, 2008

What is the format I should use for my data file for scoring with the ADAPA Predictive Analytics demo?

You should upload your data as a CSV file. Make sure the data file contains every input field your model uses. If a field is missing, ADAPA will not generate any scores.

Also, the first row should contain the names of the variables.
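
If you want to check a file before uploading it, a small script can compare the header row against the list of inputs your model expects. The following is only a minimal sketch in Python: the function name `missing_fields` and the sample data are made up for illustration, and you would need to supply the list of required fields yourself (for instance by reading it from your PMML file). It is not part of ADAPA.

```python
import csv
import io

def missing_fields(csv_text, required_fields):
    """Return the required fields that do not appear in the CSV header row.

    Hypothetical helper for illustration; ADAPA does not provide it.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # first row holds the variable names
    present = {name.strip() for name in header}
    return [field for field in required_fields if field not in present]

# Made-up sample: the header lacks the "Education" column.
sample = "Age,Employment,Income\n38,Private,81838\n"
print(missing_fields(sample, ["Age", "Employment", "Education"]))
```

An empty list means every required field is present in the header. The comparison is case-sensitive, on the assumption that the column names must match the model's field names exactly.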

For example, for the model "Audit_NN" available in the PMML Examples page of the Zementis website, the first 6 rows of the .csv data file used to validate the model look like:

Age,Employment,Education,Marital,Occupation,Income,Sex,Deductions,Hours,Adjusted
38,Private,College,Unmarried,Service,81838,Female,0,72,0
35,Private,Associate,Absent,Transport,72099,Male,0,30,0
32,Private,HSgrad,Divorced,Clerical,154676.74,Male,0,40,0
45,Private,Bachelor,Married,Repair,27743.82,Male,0,55,1
60,Private,College,Married,Executive,7568.23,Male,0,40,0

Note that the variable "Adjusted" is the predicted field. It appears in the example above because this file is used for validation (score matching). If you only want to score your data, leave the predicted column out. ADAPA will return a computed score for each record.
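
If you already have a validation file and want to reuse it for plain scoring, you can strip the predicted column with a few lines of Python. This is a rough sketch rather than an official tool: `drop_column` is a hypothetical helper, and the three-column sample is a shortened, made-up version of the data above.

```python
import csv
import io

def drop_column(csv_text, column):
    """Return the CSV text with one named column removed.

    Hypothetical helper for illustration; the header row is kept.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in (reader.fieldnames or []) if name != column]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the dropped column in each row.
    writer = csv.DictWriter(out, fieldnames=kept,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()

# Shortened, made-up validation data with the predicted field "Adjusted".
validation = "Age,Hours,Adjusted\n38,72,0\n45,55,1\n"
print(drop_column(validation, "Adjusted"))
```

The result keeps the header row and all input columns, so it can be saved as a new .csv file for scoring.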