Begin with the “census2.csv” data file, which contains census data on various tracts in a district. The fields in the data are:
• Total population (thousands)
• Professional degree (percent)
• Employed, age over 16 (percent)
• Government employed (percent)
• Median home value (dollars)

a) Conduct a principal component analysis using the covariance matrix (the default for prcomp and for many routines in other software), and interpret the results. How much of the variance is accounted for by the first component, and why?
b) Divide the MedianHomeValue field by 100,000 so that median home value is measured in units of $100,000 rather than in dollars. How does this change the analysis?
c) Compute the PCA with the correlation matrix instead. How does this change the result, and how does your answer compare with your answer in b), if you did that part?
d) Analyze the correlation matrix for this dataset for significance, and look for pairs of variables that are extremely correlated or nearly uncorrelated. Discuss the effect of this on the analysis.
e) Discuss what using the correlation matrix does and why it may or may not be appropriate in this case.
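
As a rough illustration of why the scale of one field matters in parts a) to c), the sketch below compares the share of total variance captured by the first principal component under a covariance matrix, a rescaled covariance matrix, and a correlation matrix. It is written in Python with the standard library only, not in R. The data are synthetic and made up for illustration (they are not the contents of census2.csv), the columns are drawn independently, and the helper names (covariance_matrix, correlation_matrix, leading_eigenvalue, first_component_share) are hypothetical. Power iteration stands in here for the decomposition that prcomp performs.

```python
import random
import statistics


def covariance_matrix(columns):
    """Sample covariance matrix (n - 1 denominator) of equal-length columns."""
    n = len(columns[0])
    means = [statistics.fmean(col) for col in columns]
    return [
        [
            sum((a - ma) * (b - mb) for a, b in zip(col_a, col_b)) / (n - 1)
            for col_b, mb in zip(columns, means)
        ]
        for col_a, ma in zip(columns, means)
    ]


def correlation_matrix(cov):
    """Rescale a covariance matrix so that every variable has unit variance."""
    size = len(cov)
    sd = [cov[i][i] ** 0.5 for i in range(size)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(size)] for i in range(size)]


def leading_eigenvalue(matrix, iterations=2000):
    """Largest eigenvalue of a symmetric positive semidefinite matrix (power iteration)."""
    size = len(matrix)
    v = [1.0] * size
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(size)) for i in range(size)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(size)) for i in range(size)]
    return sum(a * b for a, b in zip(v, mv))  # Rayleigh quotient; v has unit length


def first_component_share(matrix):
    """Fraction of total variance (the trace) carried by the first component."""
    trace = sum(matrix[i][i] for i in range(len(matrix)))
    return leading_eigenvalue(matrix) / trace


# Synthetic stand-in data (an assumption, NOT census2.csv): independent
# columns whose units loosely mimic the five fields listed above.
rng = random.Random(0)
n_tracts = 60
population = [rng.uniform(1, 10) for _ in range(n_tracts)]            # thousands
professional = [rng.uniform(0, 10) for _ in range(n_tracts)]          # percent
employed = [rng.uniform(50, 90) for _ in range(n_tracts)]             # percent
government = [rng.uniform(10, 35) for _ in range(n_tracts)]           # percent
home_value = [rng.uniform(100_000, 500_000) for _ in range(n_tracts)] # dollars

raw = [population, professional, employed, government, home_value]
rescaled = raw[:4] + [[v / 100_000 for v in home_value]]  # units of $100,000

share_raw = first_component_share(covariance_matrix(raw))
share_rescaled = first_component_share(covariance_matrix(rescaled))
share_corr = first_component_share(correlation_matrix(covariance_matrix(raw)))
share_corr_rescaled = first_component_share(
    correlation_matrix(covariance_matrix(rescaled))
)

print("covariance, home value in dollars:   ", share_raw)
print("covariance, home value in $100,000s: ", share_rescaled)
print("correlation (either unit):           ", share_corr, share_corr_rescaled)
```

On data shaped like this, the dollar-valued column has a variance many orders of magnitude larger than the others, so it dominates the covariance-based first component. The correlation matrix does not change when a column is rescaled, which is why the last two shares agree. Results on the real dataset will differ, since its variables are presumably correlated.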
"Looking for a Similar Assignment? Order now and Get 10% Discount! Use Code "Newclient"