Dsci Final Report
Author: vernonwang • July 19, 2012 • Essay • 786 Words (4 Pages) • 1,270 Views
Problem#1
Screen 1.1 Screen 1.2
Question 1.1
The proportion of individuals who purchased organic products is 1460 / (1460 + 4540) = 1460 / 6000 = 24.3%.
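The arithmetic above can be checked with a short Python sketch. The function name `proportion` and its parameter names are hypothetical, chosen only for illustration; the two counts are the ones quoted in the answer.

```python
def proportion(purchasers: int, non_purchasers: int) -> float:
    """Return the share of purchasers among all individuals."""
    total = purchasers + non_purchasers
    if total == 0:
        raise ValueError("no observations to compute a proportion from")
    return purchasers / total


# Counts taken from the answer to Question 1.1.
share = proportion(1460, 4540)
print(f"{share:.1%}")  # prints 24.3%
```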
Question 1.2
Some variables are different measurements of the same information, or are otherwise redundant, so AGEGRP1, AGEGRP2, NEIGHBORHOOD, LCDATE, and ORGANICS should not be included as input variables.
Problem#2
Question 2.1
Based on the data, 12 leaves are selected.
Question 2.2
The variable named “Age” is used for the first split.
Question 2.3
Based on the data, 15 leaves are selected.
Question 2.4
The variable named “Age” is used for the first split.
Screen 2.1
Screen 2.2 Screen 2.3
Screen 2.4
Problem#3
Question 3.1
Yes, some input values are missing: 412 in AFFL, 611 in AGE, and 103 in LTIME. Imputation does not need to be done before generating the Decision Tree model, since missing values affect only the regression model and have no influence on the Decision Tree model.
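As a rough illustration of what imputation involves, the sketch below counts missing values in a column and fills them with the mean of the observed values. The report does not say which imputation method was used, so mean imputation is an assumption here, and the helper names `count_missing` and `impute_mean` and the three sample rows are hypothetical, not taken from the actual data set.

```python
from statistics import mean


def count_missing(rows, column):
    """Count the rows in which the given column has no value (None)."""
    return sum(1 for row in rows if row.get(column) is None)


def impute_mean(rows, column):
    """Return new rows with missing values in `column` replaced by the observed mean."""
    observed = [row[column] for row in rows if row.get(column) is not None]
    if not observed:
        raise ValueError(f"column {column!r} has no observed values")
    fill = mean(observed)
    # Build new dictionaries so the original rows stay unchanged.
    return [
        {**row, column: fill if row.get(column) is None else row[column]}
        for row in rows
    ]


# Hypothetical sample rows; the real data set is not reproduced here.
rows = [
    {"AGE": 40, "AFFL": 10},
    {"AGE": None, "AFFL": 8},
    {"AGE": 60, "AFFL": None},
]

missing_age = count_missing(rows, "AGE")
imputed = impute_mean(rows, "AGE")
print(missing_age, imputed[1]["AGE"])
```

A regression model typically needs a complete value for every input in a row, which is why a step like this comes before fitting it.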
Question 3.2
GENDERF, AFFL, and AGE are included in the final model. The most important variable is GENDER.
Question 3.3
The data warrant a transformation to improve the fit of the model, improve additivity, and correct non-normality. A transformation is used to modify the distributions of some of the original variables to make them more closely
...