When there’s a significant overlap among data points, scatter plots become less useful for observing relationships. Consider the following contrived example with 10,000 observations falling into two overlapping clusters of data:
set.seed(1234)
n <- 10000
c1 <- matrix(rnorm(n, mean=0, sd=.5), ncol=2)
c2 <- matrix(rnorm(n, mean=3, sd=2), ncol=2)
mydata <- rbind(c1, c2)
mydata <- as.data.frame(mydata)
names(mydata) <- c("x", "y")
You can generate a standard scatter plot of these variables with the following code; with this many points, the overlap makes the relationship between x and y hard to discern:
with(mydata,
     plot(x, y, pch=19, main="Scatter Plot with 10,000 Observations"))
The smoothScatter() function uses a kernel-density estimate to produce smoothed color density representations of the scatter plot. The following code applies it to the same data:
with(mydata,
     smoothScatter(x, y, main="Scatter Plot Colored by Smoothed Densities"))
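To make the idea of a kernel-density estimate more concrete, the sketch below computes a two-dimensional density for the same simulated data and draws it as a heat map. It uses kde2d() from the MASS package as a stand-in for illustration; smoothScatter() relies on a different (binned) estimator with its own bandwidth defaults, so the two pictures will be similar but not identical. The grid size of 100 and the white-to-blue palette are arbitrary choices.

```r
library(MASS)  # provides kde2d(); used here only for illustration

# Recreate the two overlapping clusters from the example
set.seed(1234)
n <- 10000
c1 <- matrix(rnorm(n, mean=0, sd=.5), ncol=2)
c2 <- matrix(rnorm(n, mean=3, sd=2), ncol=2)
mydata <- as.data.frame(rbind(c1, c2))
names(mydata) <- c("x", "y")

# Estimate the joint density of x and y on a 100 x 100 grid
dens <- kde2d(mydata$x, mydata$y, n=100)

# Darker cells mark regions where points are more concentrated
image(dens, col=colorRampPalette(c("white", "blue"))(50),
      xlab="x", ylab="y", main="Kernel Density Estimate (kde2d sketch)")
contour(dens, add=TRUE)
```

The dens object is a list holding the grid coordinates (x, y) and the estimated density at each grid cell (z), which is the quantity a smoothed scatter plot maps to color.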
Scatter plots and scatter-plot matrices display bivariate relationships. What if you want to visualize the interaction of three quantitative variables at once? In this case, you can use a 3D scatter plot.
For example, say that you’re interested in the relationship between automobile mileage, weight, and displacement. You can use the scatterplot3d() function in the scatterplot3d package to picture their relationship. The format is
scatterplot3d(x, y, z)
where x is plotted on the horizontal axis, y is plotted in perspective (the depth axis), and z is plotted on the vertical axis.
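A minimal sketch of such a call follows. It assumes the scatterplot3d package is installed and uses the built-in mtcars data frame, whose wt, disp, and mpg columns hold weight, displacement, and mileage; the choice of data set and the plot title are assumptions made for illustration.

```r
# Assumes the scatterplot3d package has been installed
library(scatterplot3d)

# mtcars (assumed data set): wt = weight, disp = displacement, mpg = mileage
s3d <- with(mtcars,
            scatterplot3d(wt, disp, mpg,
                          main="Basic 3D Scatter Plot"))
```

The call returns (invisibly) a list of helper functions, captured here as s3d, that can be used to add further points or a regression plane to the same plot.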
Sample sizes for detecting significant effects in a one-way ANOVA
library(pwr)
es <- seq(.1, .5, .01)               # effect sizes (f) to evaluate
nes <- length(es)
samsize <- numeric(nes)              # per-group sample size for each effect size
for (i in seq_len(nes)) {
  result <- pwr.anova.test(k=5, f=es[i], sig.level=.05, power=.9)
  samsize[i] <- ceiling(result$n)    # round up to whole observations
}
plot(samsize, es, type="l", lwd=2, col="red",
     ylab="Effect Size",
     xlab="Sample Size (per cell)",
     main="One Way ANOVA with Power=.90 and Alpha=.05")
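The same relationship can be read in the other direction: fix the per-group sample size and solve for the smallest effect size detectable with the same power and significance level. The sketch below does this for a handful of sample sizes; the particular values in ns are arbitrary and chosen only for illustration.

```r
library(pwr)

# Hypothetical per-group sample sizes to compare
ns <- c(50, 100, 200, 400)

# Leaving f unspecified makes pwr.anova.test() solve for the effect size
f_detect <- sapply(ns, function(n) {
  pwr.anova.test(k=5, n=n, sig.level=.05, power=.9)$f
})

# Tabulate detectable effect size against sample size
print(data.frame(n=ns, f=round(f_detect, 3)))
```

Comparing how much f drops between successive rows shows how much additional sensitivity each increase in sample size buys.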
Graphs such as these can help you estimate the impact of various conditions on your experimental design. For example, there appears to be little bang for the buck in increasing the sample size above 200 observations per group. We’ll look at another plotting example in the next section.