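toby3003 posted on 2014-3-22 01:24
Thank you for your careful reply, but my current situation is this: I have data on a little over 20,000 diabetes patients, with roughly 300,000 follow-up visits in total, and I ...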
Counting the number of patients who have a specific number of visits is a good idea, but you should probably start from the other direction and count the number of visits for each patient.
Following @jmpamao's example:
data <- read.table(text="id time dose
1 1 a
1 2 b
1 3 c
1 4 d
1 5 e
2 1 f
2 2 g
2 3 h
2 4 i
3 1 g
3 2 k
3 3 l
4 1 d
4 2 f
4 3 g
4 4 r
4 5 u
",header=TRUE)
# Count the number of visits for each patient
# Assuming each row represents one visit
frq <- table(data$id)
# frq is a one-dimensional table: its names are ids and its elements are visit counts
# Keep the rows of patients with exactly 3 visits
slct <- data[data$id %in% as.numeric(names(frq[frq == 3])), ]
Keep in mind that the names of frq are of type character. If the id field in the original dataset is not of type character, you will need to convert the names explicitly (as done above).
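If the type conversion is a nuisance, the same per-patient count can be attached to every row with ave(), which avoids going through names at all. This is only a minimal sketch under the same assumption of one row per visit; the names visits, n_visits, exactly3 and atleast4 are made up for illustration.

```r
# Hypothetical toy data with the same id/time layout: one row per visit
visits <- data.frame(
  id   = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4),
  time = c(1:5, 1:4, 1:3, 1:5)
)

# For every row, ave() returns the number of rows that share that row's id
n_visits <- ave(visits$time, visits$id, FUN = length)

# Rows of patients with exactly three visits
exactly3 <- visits[n_visits == 3, ]

# Rows of patients with at least four visits
atleast4 <- visits[n_visits >= 4, ]
```

Because n_visits lines up row by row with the data, any condition on the visit count (==, >=, a range) becomes a plain logical subset.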
@jmpamao, this is the first time I have seen someone use read.table in such a "datalines" fashion.