Monday, November 5, 2012

R: how to draw added-variable plot (partial-regression plot)

A partial-regression (added-variable) plot is very helpful in detecting influential points in multiple regression.  The SAS version, together with an introduction to the added-variable plot, is here:

http://www.songhuiming.com/2011/10/sas-how-to-draw-added-variable-plot.html


Here is how to do it in R with library(car):

library(foreign)
crime <- read.spss("http://dl.dropbox.com/u/10684315/ucla_reg/crime.sav", to.data.frame=TRUE)

head(crime)

## convert uppercase column names to lowercase
names(crime)<-tolower(names(crime))
row.names(crime)<-crime$state

crime_reg1 <- lm(crime~pctmetro+pctwhite+poverty+single, data=crime)
summary(crime_reg1)

library(car)
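## added-variable plot for "single"; id.n=51 labels every point,
## ranked by Cook's distance
## (newer releases of car appear to take these options as
##  id=list(method=..., n=51, labels=...) instead)
avPlots(crime_reg1,"single",labels=row.names(crime),id.method=cooks.distance(crime_reg1),id.n=51)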


The output graph is given below. It plots the residuals of crime against the residuals of single, where both crime and single have been adjusted for the other variables (pctmetro+pctwhite+poverty). The graph shows that DC needs a closer look. AK and WV may also be influential points.
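
To see what the plot is made of, here is a minimal sketch that builds the same kind of graph by hand with base R. It is not the code behind avPlots. It uses the built-in mtcars data as a stand-in, so nothing has to be downloaded: mpg plays the role of crime, wt the role of single, and hp + disp the role of the other variables. All of these names are assumptions for illustration only.

```r
## Stand-in data: mtcars ships with R (an assumption for illustration,
## not the crime data used above).
full <- lm(mpg ~ wt + hp + disp, data = mtcars)

## Step 1: residuals of the response after adjusting for the other variables
res_y <- resid(lm(mpg ~ hp + disp, data = mtcars))

## Step 2: residuals of the variable of interest after the same adjustment
res_x <- resid(lm(wt ~ hp + disp, data = mtcars))

## Step 3: plot one set of residuals against the other and label the points
plot(res_x, res_y, xlab = "wt | others", ylab = "mpg | others")
text(res_x, res_y, labels = row.names(mtcars), pos = 3, cex = 0.7)

## The fitted line through these residuals has the same slope
## as the coefficient of wt in the full model.
av_fit <- lm(res_y ~ res_x)
abline(av_fit)
same_slope <- all.equal(unname(coef(av_fit)["res_x"]),
                        unname(coef(full)["wt"]))

## Rank observations by Cook's distance to back up the visual impression
top_cook <- head(sort(cooks.distance(full), decreasing = TRUE), 3)
print(top_cook)
```

A point that sits far from the others along the horizontal axis and pulls the line toward itself is a candidate influential point, which is the pattern to look for with DC in the crime data.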




