Wednesday, November 28, 2012

R: Replicate plot with ggplot2

The purpose is to replicate the scatter plots from UCLA ATS with ggplot2. The original plots are on the UCLA ATS page:  Scatter plot

ggplot2 has less to remember than plotting in base R. It is built around a few important concepts:

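1: mapping and scale: map variables in the data to plot attributes (aesthetics) such as x, y, colour, group and so on; scales control how the data values are translated into those attributes.

2: geometric object: what is drawn, such as points, lines, curves, histograms, bar charts and so on

3: statistics: statistical transformations of the data, such as a regression fit, smoothing, or binning

4: coordinate: the coordinate system, including which range of the data is shown

5: layer and facet: layers stack several geoms or statistics in one plot; facets split the data by group into separate panels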
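As a rough sketch of how these five concepts fit together in a single call, the snippet below uses the built-in mtcars data set. That data set and the variables wt, mpg and cyl are my choice for illustration only and are not part of the UCLA example.

```r
library(ggplot2)

# mtcars ships with R; it stands in for hsb2 here (illustrative assumption)
p_demo <- ggplot(mtcars, aes(x = wt, y = mpg)) +   # 1: mapping of data to x and y
  geom_point() +                                    # 2: geometric object (points)
  stat_smooth(method = "lm", se = FALSE) +          # 3: statistic (linear regression line)
  coord_cartesian(xlim = c(1, 6)) +                 # 4: coordinate (x range that is shown)
  facet_wrap(~ cyl)                                 # 5: facet (one panel per cylinder count)

# each geom_*/stat_* call added one layer: points first, then the fitted line
print(p_demo)
```

Each `+` adds one piece, so any of the five lines can be dropped or swapped without touching the others.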

Now I will show how to draw the graphs on the website above.


## to replicate: http://www.ats.ucla.edu/stat/r/gbe/scatter.htm
 
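rm(list = ls())   # start from an empty workspace
cat("\014")       # send a form feed, which clears the console in RStudio
# read the data set used in the UCLA example
hsb2 <- read.table('http://www.ats.ucla.edu/stat/r/modules/hsb2.csv', header = TRUE, sep = ",")
head(hsb2)        # first rows
str(hsb2)         # variable names and types
 
library(ggplot2)
# base plot: math on the x axis, write on the y axis; the graphs below add layers to it
p <- ggplot(hsb2, aes(x = math, y = write))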
 
## graph1
p+geom_point()
 
## graph2
p+geom_point()+stat_smooth(method="lm")
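
To make clearer what stat_smooth(method="lm") draws, here is a minimal sketch that fits the same line by hand with lm() and overlays it. It uses the built-in mtcars data as a stand-in so that it runs without the hsb2 download; the names fit, b and q are only for this illustration.

```r
library(ggplot2)

# mtcars stands in for hsb2 (illustrative assumption)
fit <- lm(mpg ~ wt, data = mtcars)   # ordinary least squares fit of mpg on wt
b <- unname(coef(fit))               # b[1] is the intercept, b[2] is the slope

q <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE) +        # line fitted by ggplot2
  geom_abline(intercept = b[1], slope = b[2],
              linetype = "dashed", colour = "red") # the lm() line drawn by hand
print(q)
```

The dashed red line should sit on top of the smooth line, since both come from the same least squares fit of y on x.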
 

## graph3_4
p+geom_point(aes(colour=factor(female)))
 
(Graphs 3-5 differ from the plots on UCLA ATS. Their plots appear to have a problem: the two largest math values for males are both 75, but those points are missing from their third plot. I have emailed them to double-check.)
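
A claim like the one about the two largest math values can be checked with a few lines. The sketch below runs on a small toy data frame whose values are invented for illustration; it also assumes that female is coded 0 for male and 1 for female. With the real data, replace toy with hsb2.

```r
# toy stand-in for hsb2: values are invented, only the column names match
toy <- data.frame(female = c(0, 0, 0, 1, 1),
                  math   = c(75, 75, 60, 72, 55))

# math scores of the males (assumes female == 0 means male)
male_math <- toy$math[toy$female == 0]

# the two largest of those scores
top2 <- head(sort(male_math, decreasing = TRUE), 2)
print(top2)
```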
## graph5
# colour and shape both follow female; se = FALSE hides the confidence band
ggplot(hsb2, aes(x = math, y = write, colour = factor(female), shape = factor(female))) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)



