Original poster: jiqimao742
[Study Sharing] Practice: Creating Sample Datasets – Exercises

jiqimao742 posted on 2016-10-8 20:00:15

Creating Sample Datasets – Exercises
How to generate random data

Creating sample data is a common task performed in many different scenarios.

R has several base functions that make the sampling process quite easy and fast.

Below is an explanation of the main functions used in this set of exercises:

1. set.seed() – R draws samples with a pseudo-random number generator. Calling set.seed() with a fixed value before a random-related function makes that function return the same sample on every run.

2. sample() – Draws a random sample. The arguments of the function are:
x – a vector of values to sample from,
size – the number of values to draw,
replace – whether a value may be drawn more than once (TRUE) or not (FALSE),
prob – a vector of probability weights, one for each value in x.

3. seq()/seq.Date() – Create a sequence of values/dates, ranging from a ‘start’ to an ‘end’ value.

4. rep() – Repeat a value/vector n times.

5. rev() – Reverse the order of the values within a vector.
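
As a rough illustration of set.seed() and sample() as described in the list above, here is a minimal sketch. The seed 42, the die-like vector 1:6 and the letters "a", "b", "c" are arbitrary values chosen for this example and do not come from the exercises.

```r
# Same seed followed by the same call gives the same sample.
set.seed(42)                        # 42 is an arbitrary seed for this sketch
first_draw <- sample(x = 1:6, size = 10, replace = TRUE)

set.seed(42)
second_draw <- sample(x = 1:6, size = 10, replace = TRUE)

identical(first_draw, second_draw)  # TRUE

# Without replacement every value is used at most once,
# so size cannot exceed length(x).
shuffled <- sample(x = 1:6, size = 6, replace = FALSE)

# 'prob' gives one weight per element of x; "c" should be drawn most often.
set.seed(42)
weighted <- sample(x = c("a", "b", "c"), size = 1000, replace = TRUE,
                   prob = c(0.1, 0.2, 0.7))
table(weighted)
```

The exact values drawn depend on the R version and seed, so no sample output is shown here.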

You can get additional explanations for these functions by typing ‘?’ before the function’s name, for example ?sample.
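
The sequence helpers seq(), rep() and rev() can be sketched as follows. The numbers and dates are made up for this example; they are not the values the exercises ask for.

```r
# seq(): numeric sequence from 'from' to 'to' in steps of 'by'.
evens <- seq(from = 2, to = 10, by = 2)          # 2 4 6 8 10

# Calling seq() on Date objects dispatches to seq.Date();
# 'by' accepts strings such as "month".
first_days <- seq(as.Date("2020-01-01"), as.Date("2020-04-01"), by = "month")

# rep(): 'times' repeats the whole vector, 'each' repeats every element in turn.
repeated_whole <- rep(c("a", "b"), times = 2)    # "a" "b" "a" "b"
repeated_each  <- rep(c("a", "b"), each = 2)     # "a" "a" "b" "b"

# rev(): reverses element order; it also works on Date vectors.
evens_desc      <- rev(evens)                    # 10 8 6 4 2
first_days_desc <- rev(first_days)

# Help pages are opened with '?', e.g. ?seq.Date (interactive sessions only).
```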

Answers to the exercises are given at the end of this post.
If you have different solutions, feel free to post them.

Exercise 1
1. Set seed with value 1235
2. Create a Bernoulli sample of 100 ‘fair coin’ flips.
Populate a variable called fair_coin with the sample results.

Exercise 2
1. Set seed with value 2312
2. Create a sample of 10 integers, based on a vector ranging from 8 through 19.
Allow the sample to have repeated values.
Populate a variable called hourselect1 with the sample results.

Exercise 3
1. Create a vector variable called probs with the following probabilities:
‘0.05,0.08,0.16,0.17,0.18,0.14,0.08,0.06,0.03,0.03,0.01,0.01’
2. Make sure the sum of the vector equals 1.
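
When checking that a probability vector sums to 1, keep in mind that exact equality tests on floating-point numbers can be fragile. The sketch below uses a hypothetical three-element vector p (not the exercise vector) and compares within a tolerance using all.equal().

```r
# Hypothetical probability vector used only for this sketch.
p <- c(0.1, 0.2, 0.7)

# Exact comparison of doubles can fail even when the arithmetic "should" match:
0.1 + 0.2 == 0.3                      # FALSE

# all.equal() compares within a small tolerance;
# wrap it in isTRUE() so it can be used as a condition.
sums_to_one <- isTRUE(all.equal(sum(p), 1))
sums_to_one                           # TRUE

# stopifnot() turns the check into a guard that errors if the weights are off.
stopifnot(sums_to_one)
```

According to the R documentation for sample(), the prob weights need not sum to one, so this check is a sanity check on the data rather than a requirement of sample() itself.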

Exercise 4
1. Set seed with value 1976
2. Create a sample of 10 integers, based on a vector ranging from 8 through 19.
Allow the sample to have repeated values and use the probabilities defined in the previous exercise.
Populate a variable called hourselect2 with the sample results.

Exercise 5
Let’s prepare the variables for a biased coin:
1. Populate a variable called coin with 5 zeros in a row and 5 ones in a row
2. Populate a variable called probs with the value 0.08 five times in a row, followed by the value 0.12 five times in a row.
3. Make sure the probabilities in the probs variable sum to 1.

Exercise 6
1. Set seed with value 345124
2. Create a biased sample of length 100, using the coin vector as the input and the probs vector as the probabilities.
Populate a variable called biased_coin with the sample results.

Exercise 7
Compare the sum of values in fair_coin and biased_coin
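
Before comparing the two sums, it can help to know what to expect. The sketch below computes the expected number of ones in 100 flips for each coin, treating the sum of 0/1 draws as binomial. The vectors follow the description in Exercise 5; the variable names n_flips, p_one_fair and so on are introduced only for this illustration.

```r
# Values from the exercise text: five 0s then five 1s, weighted 0.08 and 0.12.
coin  <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
probs <- c(0.08, 0.08, 0.08, 0.08, 0.08, 0.12, 0.12, 0.12, 0.12, 0.12)
n_flips <- 100

# Probability that one draw is a 1: weight on the 1s over the total weight.
p_one_fair   <- 0.5
p_one_biased <- sum(probs[coin == 1]) / sum(probs)   # about 0.6

# Sum of n 0/1 draws is binomial: mean n * p, sd sqrt(n * p * (1 - p)).
expected_fair   <- n_flips * p_one_fair              # 50
expected_biased <- n_flips * p_one_biased            # about 60
sd_fair   <- sqrt(n_flips * p_one_fair * (1 - p_one_fair))
sd_biased <- sqrt(n_flips * p_one_biased * (1 - p_one_biased))

round(c(expected_fair, expected_biased), 2)
round(c(sd_fair, sd_biased), 2)
```

Under these assumptions the biased coin should produce roughly ten more ones than the fair coin on average, with a spread of about five in either direction for each coin.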

Exercise 8
1. Create a ‘Date’ variable called startDate with value 9th of February 2010 and a second ‘Date’ variable called endDate with value 9th of February 2005
2. Create a descending sequence of dates containing the 9th of every month between those two dates. Populate a variable called seqDates with the sequence of dates.

Exercise 9
Reverse the sequence of dates created in the previous exercise so that they are in ascending order, and store them in a variable called RevSeqDates.

Exercise 10
1. Set seed with value 10
2. Create a sample of 20 unique values from the RevSeqDates vector.


Answers


Below are the solutions to these exercises on creating a sample dataset.
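
A note on reproducing the printed output: this post dates from 2016, and R 3.6.0 later changed the default algorithm sample() uses (the sample.kind setting of RNGkind() went from "Rounding" to "Rejection"). On a current R installation, the same seeds may therefore give different samples from the ones printed here. The sketch below, which assumes R 3.6.0 or later, draws the same sample under both settings so the two can be compared; whether the legacy setting matches the printed values exactly has not been verified here.

```r
# Assumes R >= 3.6.0, where the 'sample.kind' argument exists.
old_kind <- RNGkind()                        # remember the current settings

# Legacy (pre-3.6.0) sampling; R warns that this sampler is non-uniform.
suppressWarnings(RNGkind(sample.kind = "Rounding"))
set.seed(2312)
legacy_draw <- sample(8:19, 10, replace = TRUE)

# Default sampling since R 3.6.0.
RNGkind(sample.kind = "Rejection")
set.seed(2312)
current_draw <- sample(8:19, 10, replace = TRUE)

legacy_draw
current_draw

# Restore whatever was active before this sketch.
suppressWarnings(
  RNGkind(kind = old_kind[1], normal.kind = old_kind[2],
          sample.kind = old_kind[3])
)
```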

#####################
##    Exercise 1    ##
#####################
set.seed(1235)
fair_coin <- sample(c(0,1), 100, replace = TRUE)

#####################
##    Exercise 2    ##
#####################
set.seed(2312)
hourselect1 <- sample(c(8:19),10,replace=TRUE)
hourselect1
##  [1] 14 19 16 18 13 15  8 10 10 16

#####################
##    Exercise 3    ##
#####################
probs <- c(0.05,0.08,0.16,0.17,0.18,0.14,0.08,0.06,0.03,0.03,0.01,0.01)
sum(probs)
## [1] 1

#####################
##    Exercise 4    ##
#####################
set.seed(1976)
hourselect2 <- sample(c(8:19),10,replace=TRUE,prob = probs)
hourselect2
##  [1] 15 11 12 15 12  9 14 12 10  9

#####################
##    Exercise 5    ##
#####################
coin <- rep(c(0,1),each=5)
coin
##  [1] 0 0 0 0 0 1 1 1 1 1
probs <- rep(c(0.08,0.12),each=5)
sum(probs)
## [1] 1

#####################
##    Exercise 6    ##
#####################
set.seed(345124)
biased_coin <- sample(coin, 100, replace = TRUE,prob=probs)

#####################
##    Exercise 7    ##
#####################
sum(fair_coin)
## [1] 52
sum(biased_coin)
## [1] 63

#####################
##    Exercise 8    ##
#####################
startDate <- as.Date("2010-02-09")
endDate <- as.Date("2005-02-09")
seqDates <- seq.Date(startDate, endDate, by = "-1 month")

#####################
##    Exercise 9    ##
#####################
RevSeqDates <- rev(seqDates)

#####################
##    Exercise 10   ##
#####################
set.seed(10)
sample(RevSeqDates,20,replace=FALSE)
##  [1] "2007-08-09" "2006-08-09" "2007-03-09" "2008-06-09" "2005-06-09"
##  [6] "2006-02-09" "2006-05-09" "2006-04-09" "2007-10-09" "2006-12-09"
## [11] "2007-11-09" "2007-06-09" "2005-07-09" "2009-03-09" "2006-06-09"
## [16] "2006-09-09" "2005-04-09" "2006-01-09" "2006-07-09" "2008-01-09"






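Reply 1
stzhao (employment verified) posted on 2016-10-9 17:26:52
Copied from R-bloggers

Reply 2
日新少年 (student verified) posted on 2016-10-9 23:48:17
Quoting stzhao, posted on 2016-10-9 17:26:
Copied from R-bloggers
Could you share the blog link?

Reply 3
jiqimao742 posted on 2016-10-10 01:22:53
Quoting 日新少年, posted on 2016-10-9 23:48:
Could you share the blog link?
https://www.r-bloggers.com/creating-sample-datasets-exercises/

Here is the link.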
