Original poster: ReneeBK

How Much of R Is Written in the R Language?

My boss sent me an email (on my day off!) asking me just how much of R is written in the R language. This is very simple to answer if you use R and a Unix-like system. It also gives me a good excuse to defend the title of this blog. It's librestats, not projecteulerstats, after all.
So I grabbed the R-2.13.1 source package from CRAN and wrote up a little script that looks at all the .R, .c, and .f files in the archive and records the language (R, C, or Fortran), the number of lines of code, and the file the code came from. Then it's just a matter of dumping all that to a CSV (converted to .xls in LibreOffice, because WordPress hates freedom).
We'll talk in a minute about just how you would generate that CSV, but first let's address the original question.
A respectable majority of the source code files in core R are written in R:

At first glance, it seems like Fortran doesn't contribute much. However, when we look at the proportion of lines of code, we see something more reasonable:


So there you have it. Roughly 22% of R is written in R. I know some people want R to be written in R for some crazy reason, but really, if anything, that 22% is too high. Trust me, you really want C and Fortran to be doing all the heavy lifting so that things stay nice and peppy.
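
The two views above (share of files versus share of lines) are simple to compute once the per-file CSV exists. Here is a minimal sketch, assuming the CSV has a header row and the columns language,lines,file in that order. The column order and the function name summarise_by_language are assumptions made for illustration, not something taken from the original script.

```shell
# Hypothetical helper: read a CSV (header row, then language,lines,file)
# from standard input and print one row per language:
#   language,file_count,percent_of_files,line_count,percent_of_lines
# The column order of the input is an assumption.
summarise_by_language() {
    awk -F, '
        NR > 1 {                      # skip the header row
            files[$1]++               # number of files per language
            lines[$1] += $2           # number of lines per language
            total_files++
            total_lines += $2
        }
        END {
            for (lang in files)
                printf "%s,%d,%.1f,%d,%.1f\n", lang, files[lang], 100 * files[lang] / total_files, lines[lang], 100 * lines[lang] / total_lines
        }
    ' | sort
}
```

With the per-file data saved as, say, r_loc.csv (a made-up file name), running `summarise_by_language < r_loc.csv` prints both percentages side by side, which makes it easy to see a language that has few files but many lines.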

Besides, this is a fairly irrelevant issue, in my opinion. What matters is that people outside of Core R are writing in R. Look at the extra packages repo and you'll see a very different story from the above graphic. That's something SAS certainly can't say, since people who want to do anything other than call some cookie-cutter SAS proc have to use IML or that ridiculous SAS macro language, each of which is somehow even more of a hilarious mess than base SAS.
Ok, so how do we get that data? I actually have a much better script than the one I'm about to describe. The new one automatically grabs every source package from CRAN that you don't already have and starts digging into them, dumping everything out into one big CSV so you can watch the trend over time. It's interesting to watch R go from being almost entirely C (92%) to slowly dropping down to ~52%. But that's a different post for a different day, because I have a few kinks to work out with that script before I would feel comfortable releasing it.
So here's how this system works. It's basically the dumbest possible solution; I'm pretty good at those, if I may say so myself. The shell script walks through the R-version/src/ folder and gets a line count of each .R, .c, and .f file. That's it; here it is:
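The script itself did not survive in this copy of the post. As a stand-in, here is a minimal sketch of the approach described above, not the author's code. The function name count_lines, the output column order, and the use of `wc -l` (which counts every line, including blanks and comments) are all assumptions.

```shell
# Hypothetical sketch: walk a source tree, find every .R, .c, and .f file,
# and print CSV rows of the form language,lines,file.
# Assumes file paths contain no commas or newlines.
count_lines() {
    src_dir="$1"
    echo "language,lines,file"
    find "$src_dir" -type f \( -name '*.R' -o -name '*.c' -o -name '*.f' \) | sort |
    while IFS= read -r file; do
        # Map the file extension to a language label.
        case "$file" in
            *.R) lang="R" ;;
            *.c) lang="C" ;;
            *.f) lang="Fortran" ;;
        esac
        # wc -l counts newline characters; strip the padding some systems add.
        lines=$(wc -l < "$file" | tr -d ' ')
        printf '%s,%s,%s\n' "$lang" "$lines" "$file"
    done
}
```

For the release discussed here, the call would look something like `count_lines R-2.13.1/src > r_loc.csv`, where r_loc.csv is a made-up output name. The extension match is case-sensitive, so only .R, .c, and .f are picked up, matching the three file types named in the post.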
