Original poster: jgchen1966

[Learning & Sharing] Big changes behind the scenes in R 3.5.0


Original post
jgchen1966 posted on 2018-4-28 17:01:46

Big changes behind the scenes in R 3.5.0
April 24, 2018
By David Smith

https://www.r-bloggers.com/big-changes-behind-the-scenes-in-r-3-5-0/




A major update to R is now available. The R Core group has announced the release of R 3.5.0, and binary versions for Windows and Linux are now available from the primary CRAN mirror. (The Mac release is forthcoming.)

Probably the biggest change in R 3.5.0 will be invisible to most users — except by the performance improvements it brings. The ALTREP project has now been rolled into R to use more efficient representations of many vectors, resulting in less memory usage and faster computations in many common situations. For example, the sequence vector 1:1000000 is now represented just by its start and end value, instead of allocating a vector of a million elements as earlier versions of R would do. So while R 3.4.3 takes about 1.5 seconds to run x <- 1:1e9 on my laptop, it's instantaneous in R 3.5.0.
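The compact-sequence behaviour described above can be observed with a rough timing comparison. This is a minimal sketch, not a benchmark: the size `n` is scaled down from the article's 1e9 (an assumption made here so the script stays cheap even on older R), and the variable names are illustrative only.

```r
# Scaled-down sketch: n is much smaller than the article's 1e9 so the script
# stays cheap even on a version of R without compact sequences.
n <- 1e7

# Build the sequence; on R >= 3.5.0 this is expected to be near-instant
# because only the range needs to be recorded.
t_seq <- system.time(x <- 1:n)[["elapsed"]]

# Arithmetic on the sequence yields an ordinary, fully allocated vector.
t_full <- system.time(y <- x + 0L)[["elapsed"]]

cat("sequence:", t_seq, "s; allocated copy:", t_full, "s\n")

# Both objects hold the same integer values.
identical(x, y)
```

Timings will vary by machine and R version, so treat the printed numbers as indicative only.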

There have been improvements in other areas too, thanks to ALTREP. The output of the sort function has a new representation: it includes a flag indicating that the vector is already sorted, so that sorting it again is instantaneous. As a result, running x <- sort(x) is now free the second and subsequent times you run it, unlike in earlier versions of R. This may seem like a contrived example, but operations like this happen all the time in the internals of R code. Another good example is converting a numeric to a character vector: as.character(x) is now also instantaneous (the coercion to character is deferred until the character representation is actually needed). This has a significant impact on R's statistical modelling functions, which carry around a long character vector that usually contains just numbers (the row names) along with the design matrix. As a result, the calculation:

d <- data.frame(y = rnorm(1e7), x = 1:1e7)
lm(y ~ x, data=d)

runs about 4x faster on my system. (It also uses a lot less memory: running the equivalent command with 10x more rows failed for me in R 3.4.3 but succeeded in 3.5.0.)
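To check these claims on your own machine, something like the following could be used. It is a rough sketch under stated assumptions: the sizes are scaled down from the article's 1e7 rows, the seed is arbitrary, and the timing variable names are made up for illustration. Whether the second sort or the coercion is actually faster depends on the R version you run it with.

```r
set.seed(42)                 # arbitrary seed, for reproducibility only
x <- runif(1e6)

# Sort twice; on R >= 3.5.0 the second call is expected to be cheaper,
# since the first result is already known to be sorted.
t_sort1 <- system.time(x <- sort(x))[["elapsed"]]
t_sort2 <- system.time(x <- sort(x))[["elapsed"]]

# Numeric-to-character coercion; on R >= 3.5.0 the conversion is
# expected to be deferred until the strings are actually needed.
ids <- 1:1e6
t_chr <- system.time(s <- as.character(ids))[["elapsed"]]
head(s, 3)

# Scaled-down version of the article's regression (1e5 rows, not 1e7).
n <- 1e5
d <- data.frame(y = rnorm(n), x = 1:n)
t_lm <- system.time(fit <- lm(y ~ x, data = d))[["elapsed"]]

print(c(sort_first = t_sort1, sort_second = t_sort2,
        as_character = t_chr, lm = t_lm))
```

Running the same script under R 3.4.x and R 3.5.0 side by side is the simplest way to see the difference the article reports.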

The ALTREP system is designed to be extensible, but in R 3.5.0 the system is used exclusively for the internal operations of R. Nonetheless, if you'd like to get a sneak peek at how you might be able to use ALTREP yourself in future versions of R, you can take a look at this vignette (with the caveat that the interface may change when it's finally released).

There are many other improvements in R 3.5.0 beyond the ALTREP system, too. You can find the full details in the announcement, but here are a few highlights:

  • All packages are now byte-compiled on installation. R's base and recommended packages, and packages on CRAN, were already byte-compiled, so this will have the effect of improving the performance of packages installed from GitHub and from private sources.
  • R's performance is better when many packages are loaded, and more packages can be loaded at the same time on Windows (when packages use compiled code).
  • Improved support for long vectors, by functions including object.size, approx and spline.
  • Reading in text data with readLines and scan should be faster, thanks to buffering on text connections.
  • R should handle some international data files better, with several bugs related to character encodings having been resolved.
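As an illustration of the byte-compilation point in the list above, the sketch below compiles a small function by hand with the compiler package and checks the result. The helper `is_bytecode` is hypothetical (it is not part of R); it simply looks for the `<bytecode ...>` marker that R prints for compiled closures. The loop function is likewise just an example.

```r
library(compiler)

# A deliberately loop-heavy example function.
sum_to <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + i
  s
}

# Byte-compile it explicitly, as installation is said to do for packages.
sum_to_c <- cmpfun(sum_to)

# Hypothetical helper: TRUE if printing the function shows a bytecode marker.
is_bytecode <- function(fn) {
  any(grepl("<bytecode", capture.output(print(fn)), fixed = TRUE))
}

is_bytecode(sum_to_c)
sum_to_c(10)
```

Note that R's JIT compiler may also compile `sum_to` on its own after a few calls, so the uncompiled version is not guaranteed to stay uncompiled.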

Because R 3.5.0 is a major release, you will need to re-install any R packages you use. (The installr package can help with this.) On my reading of the release notes, there haven't been any major backward-incompatible changes, so your old scripts should continue to work. Nonetheless, given the significant changes behind the scenes, it might be best to wait for a maintenance release before using R 3.5.0 for production applications. But for developers and data science work, I recommend jumping over to R 3.5.0 right away, as the benefits are significant.
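For the re-installation step, one possible approach using only base R is sketched below. The helper `built_before` is hypothetical and written for this illustration; it lists installed packages whose recorded build version is older than a given R version. The actual reinstall call is left commented out because it needs network access and can take a long time.

```r
# TRUE when the running R is at least 3.5.0.
on_350 <- getRversion() >= "3.5.0"

# Hypothetical helper: names of installed packages built under an older R.
built_before <- function(version = "3.5.0") {
  ip <- installed.packages()
  # strict = FALSE turns malformed "Built" entries into NA instead of failing
  built <- package_version(ip[, "Built"], strict = FALSE)
  old <- which(built < package_version(version))
  unique(unname(rownames(ip)[old]))
}

stale <- built_before()
cat(length(stale), "package(s) were built under an R older than 3.5.0\n")

# To rebuild them from the configured repositories (not run here):
# update.packages(checkBuilt = TRUE, ask = FALSE)
```

The `checkBuilt = TRUE` argument asks `update.packages` to treat packages built under an earlier major version of R as out of date, which is what a move to 3.5.0 requires.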

You can find the details of what's new in R 3.5.0 at the link below. As always, many thanks go to the R Core team and the other volunteers who have contributed to the open source R project over the years.

R-announce mailing list: R 3.5.0 is released





1st reply
pika44 posted on 2018-4-28 17:39:24
Useful news, thanks.

2nd reply
420948492 posted on 2018-4-29 13:56:24
Thanks for sharing. The only downside is that reinstalling every package is a bit of a hassle.

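3rd reply
jgchen1966 posted on 2018-4-29 19:19:09
Quoting 420948492, posted on 2018-4-29 13:56:
"Thanks for sharing. The only downside is that reinstalling every package is a bit of a hassle."
  It's not really a hassle: first install the commonly used, foundational packages and the ones you are using right now, then install and update the rest as you need them.
  Continuous updating is one of R's big attractions: it reflects progress in data science right away and keeps you up to date, so things never get dull.
  When I come back to an R package after two or three months of not using it, I often think that its manual has probably been updated and that I may need to download a fresh copy...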
