|
Luo and Tsai (2012) described neuropsychological scale data where the response variable is the
score from the trail making test (part A) measuring 334 patients’ processing speed in seconds,
and the covariates are years of education, age and diagnosis. Often these types of data have several
covariates (Figs 1 and 2 in Section 7 show scatter plots of scores versus years of education
and age) and unknown statistical distributions, and standard statistical procedures fail to work
properly for them. Stamey et al. (1989) examined the correlations between the level of prostate-specific
antigen and several clinical measures in 97 men who were about to receive a radical prostatectomy.
The goal (Hastie et al., 2009) is to predict the logarithm of prostate-specific antigen level,
lpsa, from a number of measurements including log-cancer-volume, lcavol, log-prostate-weight,
lweight, age and logarithm of capsular penetration, lcp (Figs 5–7 in Section 7 show scatter plots
of lpsa versus lcavol, lweight and lcp respectively). Although there are moderately high correlations
between the response and most covariates, a linear regression model does not account for
the non-linearity at the edges of the data.
|