You should upload your data as a CSV file. Make sure the data file contains all the input fields your model actually uses. If a field is missing, ADAPA will not generate any scores. Also, the first row should contain the names of the variables.
For example, for the model "Audit_NN" available in the PMML Examples page of the Zementis website, the first 6 rows of the .csv data file used to validate the model look like:
AGE,Employment,Education,Marital,Occupation,Income,Sex,Deductions,Hours,Adjusted
38,Private,College,Unmarried,Service,81838,Female,0,72,0
35,Private,Associate,Absent,Transport,72099,Male,0,30,0
32,Private,HSgrad,Divorced,Clerical,154676.74,Male,0,40,0
45,Private,Bachelor,Married,Repair,27743.82,Male,0,55,1
60,Private,College,Married,Executive,7568.23,Male,0,40,0
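Before uploading, it can help to confirm that the header row names every input field. The snippet below is a minimal sketch using Python's standard csv module; the function name missing_fields is hypothetical, and the list of required fields is an assumption taken from the sample header above (every column except the predicted one), not from the model itself.

```python
import csv
import io

def missing_fields(csv_text, required_fields):
    """Return the required field names that are absent from the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])          # first row holds the variable names
    present = set(header)
    return [name for name in required_fields if name not in present]

# Assumed input fields: the sample header minus the last column.
required = ["AGE", "Employment", "Education", "Marital", "Occupation",
            "Income", "Sex", "Deductions", "Hours"]

sample = (
    "AGE,Employment,Education,Marital,Occupation,Income,Sex,Deductions,Hours,Adjusted\n"
    "38,Private,College,Unmarried,Service,81838,Female,0,72,0\n"
)

print(missing_fields(sample, required))                 # []
print(missing_fields("AGE,Income\n38,81838\n", required))
```

An empty list means every assumed input field appears in the header; any names returned are the ones to add before uploading.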
ADAPA also supports the use of double quotes around any of the fields (data or field names). Therefore, the following line is also compatible with ADAPA:
"38","Private","College","Unmarried","Service", ...
You should use double quotes to include commas inside a string as shown below:
"Ryan, Private": without double quotes, ADAPA would treat this single value as two strings.
You should also use double quotes to represent blank characters before or after a string. For example:
" AGE", "AGE ", and "AGE" represent different values whereas "AGE" and AGE are the same.
To represent a double quote inside a string, write it twice: "COLOR:""YELLOW""" will be interpreted by ADAPA as COLOR:"YELLOW". Use this doubled form only inside a string that is itself surrounded by double quotes.
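If you generate the file programmatically, a CSV library can apply these quoting rules for you. The following is a minimal sketch with Python's standard csv module, whose default dialect also doubles embedded quotes; the helper name to_csv_line is hypothetical, and the sketch illustrates the quoting convention rather than ADAPA's own parser.

```python
import csv
import io

def to_csv_line(values):
    """Serialize one record as a single CSV line, quoting every field."""
    buffer = io.StringIO()
    # QUOTE_ALL wraps each field in double quotes, which preserves commas
    # and leading/trailing blanks; embedded quotes are doubled by default.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(values)
    return buffer.getvalue().rstrip("\n")

line = to_csv_line(["Ryan, Private", " AGE", 'COLOR:"YELLOW"'])
print(line)
# "Ryan, Private"," AGE","COLOR:""YELLOW"""

# Reading the line back recovers the original three values.
print(next(csv.reader([line])))
# ['Ryan, Private', ' AGE', 'COLOR:"YELLOW"']
```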
For more on how to format your .csv file, click here. Beware, though, that ADAPA does not allow fields to contain embedded line breaks: in ADAPA, a record is represented by a single line.
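Since a record must fit on a single line, it may be worth scanning your data for embedded line breaks before writing the file. This is a small sketch under that assumption; the function name fields_with_line_breaks is hypothetical.

```python
def fields_with_line_breaks(record):
    """Return the positions of fields that contain a carriage return or line feed."""
    return [
        index
        for index, value in enumerate(record)
        if "\n" in str(value) or "\r" in str(value)
    ]

print(fields_with_line_breaks(["38", "Private\nSector", "College"]))  # [1]
print(fields_with_line_breaks(["38", "Private", "College"]))          # []
```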
Predicted Field
Note that in the example above, the variable "Adjusted" is the predicted field. It is present because we are using this file for validation (score matching). If you only want to score your data, you should leave the predicted column out. ADAPA will return computed scores for each entry.
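If you already have a validation file and want a scoring-only version, the predicted column can be stripped with a few lines of code. The sketch below uses Python's standard csv module; the helper name drop_column is hypothetical, and the three-column sample is a shortened stand-in for the full file.

```python
import csv
import io

def drop_column(csv_text, column):
    """Return csv_text with the named column removed from every row."""
    reader = csv.reader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")

    header = next(reader)
    index = header.index(column)       # raises ValueError if the column is absent
    writer.writerow(header[:index] + header[index + 1:])
    for row in reader:
        writer.writerow(row[:index] + row[index + 1:])
    return output.getvalue()

validation = "AGE,Income,Adjusted\n38,81838,0\n45,27743.82,1\n"
print(drop_column(validation, "Adjusted"))
# AGE,Income
# 38,81838
# 45,27743.82
```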