
The three models available in the ADAPA demo instance, and the corresponding data sets available for scoring in the sample Excel file, are:
IrisMLRModel: A multinomial logistic regression model trained with the Iris data set.
The Iris data set: This is perhaps the best-known data set in the pattern recognition literature. It contains 3 classes, each representing a different type of Iris plant, and each class is represented by 50 records. For more information on the Iris data set, see the Iris page at the UCI Repository of Machine Learning Databases: http://archive.ics.uci.edu/ml/datasets/Iris (Asuncion, A. & Newman, D.J. (2007). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science). Note that for scoring, the class has been omitted from the data set. ADAPA produces it as a result of the scoring process, together with the probability associated with each of the three classes of Iris plant covered by the data: Setosa, Versicolor, and Virginica.
AuditSVMModel: A support vector machine trained with the Audit data set.
The Audit data set: This data set is supplied as part of the Rattle package (http://rattle.togaware.com) and is also available for download as a CSV file from http://rattle.togaware.com/audit.csv. It is an artificial data set consisting of fictional clients who have been audited, perhaps for tax refund compliance. For each case, an outcome is recorded (whether or not the taxpayer's claims had to be adjusted), along with the amount of any resulting adjustment. Note that for scoring, the adjusted field has been omitted from the data set. ADAPA produces it as a result of the scoring process.
LoanNNModel: A neural network model trained with mortgage loan data.
The Loan data set: This data set contains loan-level data for several adjustable rate mortgage (ARM) loans. ARM loans originated by subprime lenders in the US were a key factor behind the financial crisis that began in 2008 and affected the entire world. The data set contains eleven features, which are used as model inputs. The output is a score signifying the risk of default for each particular loan. The score ranges from 0 to 1000; the higher the score, the higher the risk of default. Note that the score produced by ADAPA for this data set is hypothetical.
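To make the IrisMLRModel output more concrete, here is a minimal Python sketch of how a multinomial logistic regression can turn one Iris record into a predicted class plus a probability for each of the three classes. It is an illustration only, not ADAPA's implementation: the coefficient values, the feature order, and the function names are all hypothetical and are not taken from the demo model.

```python
import math

CLASSES = ["Setosa", "Versicolor", "Virginica"]

# HYPOTHETICAL coefficients, made up for illustration only.
# Layout per class: [intercept, sepal length, sepal width, petal length, petal width]
COEFFICIENTS = {
    "Setosa":     [0.0, 0.0, 1.0, -2.0, -1.0],
    "Versicolor": [0.0, 0.0, 0.0, 0.0, 0.0],   # reference class
    "Virginica":  [-8.0, 0.0, 0.0, 1.0, 2.0],
}


def class_probabilities(features):
    """Return a probability for each class using a softmax over linear scores."""
    scores = {
        name: w[0] + sum(wi * xi for wi, xi in zip(w[1:], features))
        for name, w in COEFFICIENTS.items()
    }
    # Subtract the largest score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def predict(features):
    """Return the most probable class and the full probability table."""
    probs = class_probabilities(features)
    return max(probs, key=probs.get), probs


if __name__ == "__main__":
    # One record with the class omitted, as in the scoring file.
    label, probs = predict([5.1, 3.5, 1.4, 0.2])
    print(label)
    for name in CLASSES:
        print(name, round(probs[name], 4))
```

The three probabilities always sum to 1, and the predicted class is simply the one with the highest probability, which mirrors the kind of result described above for the Iris data.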