Original poster: oliyiyi

[LaTeX edition] Filler thread

371
oliyiyi posted on 2015-7-11 17:44:26
Word clouds have become a bit cliché, but I still think that they have a place in giving a high level overview of the content of a corpus. Here are the steps I took in putting together the word cloud for the International Conference on Machine Learning (2015).

372
oliyiyi posted on 2015-7-11 17:45:01
We're looking for researchers with a passion for open source and data sharing, already working to shift research practice to be more collaborative, iterative and open. Fellows will spend 10 months starting September 2015 as community catalysts at their institutions, mentoring the next generation of open data practitioners and researchers and building lasting change in the global open science community.

Throughout their fellowship year, chosen fellows will receive training and support from Mozilla to hone their skills around open source and data sharing. They will also craft code, curriculum and other learning resources that help their local communities learn open data practices, and teach forward to their peers.

373
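oliyiyi posted on 2015-7-11 17:47:57
 One day at school.

 The teacher said: "Class, today's essay topic is 'To the Teacher' (致老师)."

 No sooner had the teacher spoken than a student stood up and said: "Teacher, you should go to the hospital. We can't cure you." (致, "to", is a homophone of 治, "to cure".)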

374
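oliyiyi posted on 2015-7-11 17:57:48
My daughter is 7 this year and has just finished first grade. On the way home after picking her up for the summer holiday, I asked her: "The other kids all got award certificates. How come you didn't?"

 Full of swagger, my daughter said: "Those certificates are just something teachers use to humor little kids. I don't care for them!"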

375
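oliyiyi posted on 2015-7-11 17:58:40
Outside the window, the rain pattered down.

 She looked at me: "It's time to meet the parents."

 My heart skipped a beat. After all this time, it was the first time she had said anything like that to me.

 My eyes welled with tears and I choked up a little as I asked, tentatively: "Isn't it a bit early?"

 She got a little worked up: "You dare to bargain with me! You haven't handed in your homework for days! Call your parents in!"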

376
oliyiyi posted on 2015-7-11 18:37:38
(This article was first published on rud.is » R, and kindly contributed to R-bloggers)

It’s not on CRAN yet, but there’s a devtools-installable R package for getting data from the OMDB API.

It covers all of the public API endpoints:

  • find_by_id: Retrieve OMDB info by IMDB ID search
  • find_by_title: Retrieve OMDB info by title search
  • get_actors: Get actors from an omdb object as a vector
  • get_countries: Get countries from an omdb object as a vector
  • get_directors: Get directors from an omdb object as a vector
  • get_genres: Get genres from an omdb object as a vector
  • get_writers: Get writers from an omdb object as a vector
  • print.omdb: Print an omdb result
  • search_by_title: Lightweight omdb title search

Here’s a bit of it in action:

devtools::install_github("hrbrmstr/omdbapi")
library(omdbapi)
library(dplyr)
library(pbapply)

search_by_title("Captain America")

# Source: local data frame [10 x 4]
#
#                                                   Title  Year    imdbID   Type
# 1                    Captain America: The First Avenger  2011 tt0458339  movie
# 2                   Captain America: The Winter Soldier  2014 tt1843866  movie
# 3                                       Captain America  1990 tt0103923  movie
# 4                                       Captain America  1979 tt0078937  movie
# 5           Iron Man and Captain America: Heroes United  2014 tt3911200  movie
# 6                    Captain America II: Death Too Soon  1979 tt0078938  movie
# 7                                       Captain America  1944 tt0036697  movie
# 8                                       Captain America 1966– tt0206474 series
# 9                        Captain America: Super Soldier  2011 tt1740721   game
# 10 Comic Book Origins: Captain America - Winter Soldier  2014 tt3618126  movie

search_by_title("Captain America", year_of_release=2013)

# Source: local data frame [1 x 4]
#
#                              Title Year    imdbID  Type
# 1 A Look Back at 'Captain America' 2013 tt3307378 movie

games <- search_by_title("Captain America", type="game")

glimpse(games)

# Observations: 2
# Variables:
# $ Title  (chr) "Captain America: Super Soldier", "Captain America and the A...
# $ Year   (chr) "2011", "1991"
# $ imdbID (chr) "tt1740721", "tt0421939"
# $ Type   (chr) "game", "game"

find_by_title(games$Title[1])

#      Title: Captain America: Super Soldier
#       Year: 2011
#      Rated: N/A
#   Released: 2011-07-19
#    Runtime: N/A
#      Genre: Action
#   Director: Michael McCormick, Robert Taylor
#     Writer: Christos N. Gage
#     Actors: Hayley Atwell, Chris Evans, Sebastian Stan, Neal McDonough
#       Plot: You play the Sentinel of Liberty as you raid the Red Skull's scientist
#             minion, Armin Zola's, lair.
#   Language: English
#    Country: USA
#     Awards: N/A
#     Poster: http://ia.media-imdb.com/images/M/
#             MV5BMTUwMzQ0NjE5N15BMl5BanBnXkFtZTgwODI3MzQxMTE@._V1_SX300.jpg
#  Metascore: N/A
# imdbRating: 7.2
#  imdbVotes: 271
#     imdbID: tt1740721
#       Type: game

find_by_title("Game of Thrones", type="series", season=1, episode=1)

#      Title: Winter Is Coming
#       Year: 2011
#      Rated: TV-MA
#   Released: 2011-04-17
#    Runtime: 62 min
#      Genre: Adventure, Drama, Fantasy
#   Director: Timothy Van Patten
#     Writer: David Benioff (created by), D.B. Weiss (created by), George R.R.
#             Martin ("A Song of Ice and Fire" by), David Benioff, D.B.
#             Weiss
#     Actors: Sean Bean, Mark Addy, Nikolaj Coster-Waldau, Michelle Fairley
#       Plot: Jon Arryn, the Hand of the King, is dead. King Robert Baratheon plans
#             to ask his oldest friend, Eddard Stark, to take Jon's
#             place. Across the sea, Viserys Targaryen plans to wed his
#             sister to a nomadic warlord in exchange for an army.
#   Language: English
#    Country: USA
#     Awards: N/A
#     Poster: http://ia.media-imdb.com/images/M/
#             MV5BMTk5MDU3OTkzMF5BMl5BanBnXkFtZTcwOTc0ODg5NA@@._V1_SX300.jpg
#  Metascore: N/A
# imdbRating: 8.5
#  imdbVotes: 12584
#     imdbID: tt1480055
#       Type: episode

get_genres(find_by_title("Star Trek: Deep Space Nine", season=5, episode=7))

# [1] "Action"    "Adventure" "Drama"

get_writers(find_by_title("Star Trek: Deep Space Nine", season=4, episode=6))

# [1] "Gene Roddenberry (based upon "Star Trek" created by)"
# [2] "Rick Berman (created by)"
# [3] "Michael Piller (created by)"
# [4] "David Mack"
# [5] "John J. Ordover"

get_directors(find_by_id("tt1371111"))

# [1] "Tom Tykwer"     "Andy Wachowski" "Lana Wachowski"

get_countries(find_by_title("The Blind Swordsman: Zatoichi"))

# [1] "Japan"

ichi <- search_by_title("Zatoichi")
bind_rows(lapply(ichi$imdbID, function(x) {
  find_by_id(x, include_tomatoes = TRUE)
})) -> zato

par(mfrow=c(3,1))
boxplot(zato$tomatoUserMeter, horizontal=TRUE, main="Tomato User Meter", ylim=c(0, 100))
boxplot(zato$imdbRating, horizontal=TRUE, main="IMDB Rating", ylim=c(0, 10))
boxplot(zato$tomatoUserRating, horizontal=TRUE, main="Tomato User Rating", ylim=c(0, 5))

You can find out more at its GitHub repo.


Users who are short on forum coins can visit the collection of reward-for-reply threads:
https://bbs.pinggu.org/thread-3990750-1-1.html

377
oliyiyi posted on 2015-7-11 18:39:57

Poynter did a nice interactive piece on world population by income (i.e. “How Many Live on How Much, and Where”). I’m always on the lookout for optimized shapefiles and clean data (I’m teaching a data science certificate program starting this Fall) and the speed of the site load and the easy availability of the data set made this one a “must acquire”. Rather than just repeat Poynter’s D3-goodness, here’s a way to look at the income data in a series of small multiple choropleths, using R & ggplot2, that involves:

  • downloading data & shapefiles from a web site
  • using dplyr & tidyr for data munging
  • applying custom fill color scale mapping in ggplot
  • ordering plots with a custom facet order (using factors)
  • tweaking the theme and aesthetics for a nicely finished result

By using D3, Poynter inherently made the data available. Pop open the “Developer Tools” in any browser, reload the page and look at the “Network” tab and you’ll see a list of files (you can sometimes see things in the source code, but this technique is often faster). The income data is a well-formed CSV file http://www.pewglobal.org/wp-content/themes/pew-global/interactive-global-class.csv and their highly optimized world map was also easy to discern http://www.pewglobal.org/wp-content/lib/js/world-geo.json. We’ll start by grabbing the map and using the same map projection that Poynter did (Robinson). Don’t be put off by all the library calls since one of the best parts of R is the ever-increasing repository of great packages to help you get things done.

library(httr)     # getting data
library(rgdal)    # working with shapefile
library(dplyr)    # awesome data manipulation
library(readr)    # faster reading of CSV data
library(stringi)  # string manipulation
library(stringr)  # string manipulation
library(tidyr)    # reshaping data
library(grid)     # for 'unit'
library(scales)   # for 'percent'
library(ggplot2)  # plotting
library(ggthemes) # theme_map

# this ensures you only download the shapefile once and hides
# errors and warnings. remove `try` and `invisible` to see messages
try(invisible(GET("http://www.pewglobal.org/wp-content/lib/js/world-geo.json",
                  write_disk("world-geo.json"))), silent=TRUE)

# use ogrListLayers("world-geo.json") to see file type &
# layer info to use in the call to readOGR

world <- readOGR("world-geo.json", "OGRGeoJSON")
world_wt <- spTransform(world, CRS("+proj=robin"))
world_map <- fortify(world_wt)
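
The "download only once" idea in the code above can also be written explicitly. The following is a minimal base-R sketch, not the author's method: `fetch_once` and `stub_fetch` are hypothetical names, the example.com URL is a placeholder, and a stub downloader is used so the sketch runs offline.

```r
# Hypothetical helper: download `url` to `dest` only if `dest` is not already there.
# `fetch` defaults to base R's download.file but can be swapped out for testing.
fetch_once <- function(url, dest, fetch = utils::download.file) {
  if (!file.exists(dest)) {
    fetch(url, dest)
  }
  invisible(dest)
}

# Offline demonstration with a stub downloader that just counts its calls
calls <- 0
stub_fetch <- function(url, dest) {
  calls <<- calls + 1
  writeLines("{}", dest)
}

dest <- tempfile(fileext = ".json")
fetch_once("http://example.com/world-geo.json", dest, fetch = stub_fetch)
fetch_once("http://example.com/world-geo.json", dest, fetch = stub_fetch)
print(calls)  # the stub is only invoked on the first call
```

Compared with wrapping the request in `try()`, an explicit `file.exists()` check skips the request entirely on later runs and does not hide unrelated errors.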

I would have liked to do fortify(world_wt, region="name") (since that makes working with filling in countries by name much easier in the choropleth part of the code) but that generated TopologyException errors (I’ve seen this happen quite a bit with simplified/optimized shapefiles and some non-D3 geo-packages). One can sometimes fix those with a strategic rgeos::gBuffer call, but that didn’t work well in this case. We can still use country names with a slight rejiggering of the fortified data frame using dplyr:

world_map %>%
  left_join(data_frame(id=rownames(world@data), name=world@data$name)) %>%
  select(-id) %>%
  rename(id=name) -> world_map
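
To see what that join does without loading a shapefile, here is a small base-R sketch on made-up data. The data frames `fortified` and `attrs` and the country names are hypothetical stand-ins for `world_map` and `world@data`; a named-vector lookup is used in place of the dplyr join.

```r
# Hypothetical stand-in for a fortified map: one row per polygon vertex,
# keyed by the row-name id of the polygon it belongs to
fortified <- data.frame(
  long = c(0, 1, 1, 5, 6),
  lat  = c(0, 0, 1, 5, 5),
  id   = c("0", "0", "0", "1", "1"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the attribute table: row names are the polygon ids
attrs <- data.frame(
  name = c("Freedonia", "Sylvania"),
  row.names = c("0", "1"),
  stringsAsFactors = FALSE
)

# Build an id -> name lookup and overwrite the id column with country names
lookup <- setNames(attrs$name, rownames(attrs))
fortified$id <- unname(lookup[fortified$id])

print(fortified)
```

After this step the `id` column holds country names, which is what lets a later choropleth match map polygons to data rows by name.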

Now it’s time to get the data. The CSV file has annoying spaces in it that cause R to interpret all the columns as strings, so we can use dplyr again to get them into the format we want them in. Note that I’m also making the percentages decimals so we can use percent later on to easily format them.

# a good exercise would be to repeat the download code above
# rather than make repeated calls to an external resource
read_csv("http://www.pewglobal.org/wp-content/themes/pew-global/interactive-global-class.csv") %>%
  mutate_each(funs(str_trim)) %>%
  filter(id != "None") %>%
  mutate_each(funs(as.numeric(.)/100), -name, -id) -> dat
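
The cleaning step above (trim, convert to numeric, divide by 100) can be tried on a toy vector. This is a base-R sketch under assumptions: the `raw` values are invented, and `fmt_percent` is a hypothetical stand-in for `scales::percent`, used only so the example has no package dependencies.

```r
# Hypothetical raw cells as they might arrive from a CSV with padded values
raw <- c(" 12.5", "40 ", " 7.5 ")

# Strip the padding, convert to numeric, and scale to a 0-1 fraction
frac <- as.numeric(trimws(raw)) / 100

# Hypothetical base-R stand-in for scales::percent(): fraction -> "x.x%" label
fmt_percent <- function(x, digits = 1) {
  sprintf(paste0("%.", digits, "f%%"), x * 100)
}

labels <- fmt_percent(frac)
print(labels)
```

Storing the values as fractions keeps them usable for arithmetic and color scales, while the formatting to a percent label happens only at display time.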

For this post, we’ll only be working with the actual share percentages, so let’s:

  • ignore the “change” columns
  • convert the data frame from wide to long
  • extract out the income levels (e.g. “Poor”, “Low Income”…)
  • set a factor order for them so our plots will be in the correct sequence

dat %>%
  gather(share, value, starts_with("Share"), -name, -id) %>%
  select(-starts_with("Change")) %>%
  mutate(label=factor(stri_trans_totitle(str_match(share, "Share ([[:alpha:]- ]+),")[,2]),
                      c("Poor", "Low Income", "Middle Income", "Upper-Middle Income", "High Income"),
                      ordered=TRUE)) -> share_dat
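
The label extraction and factor ordering can be checked in isolation. The sketch below uses base R only; the column names in `share_cols` are hypothetical examples shaped to fit the regular expression above, and the title-casing uses `gsub` with `perl = TRUE` instead of `stri_trans_totitle`.

```r
# Hypothetical column names in the shape the regular expression expects
share_cols <- c("Share high income, 2011",
                "Share poor, 2011",
                "Share upper-middle income, 2011")

# Pull out the income level between "Share " and the first comma
raw_level <- sub("^Share ([[:alpha:] -]+),.*$", "\\1", share_cols)

# Title-case each word (a word starts at the string start, a space, or a hyphen)
title_level <- gsub("(^|[ -])([a-z])", "\\1\\U\\2", raw_level, perl = TRUE)

# Fix the display order explicitly rather than relying on alphabetical order
income_levels <- c("Poor", "Low Income", "Middle Income",
                   "Upper-Middle Income", "High Income")
label <- factor(title_level, levels = income_levels, ordered = TRUE)

# Sorting follows the declared level order, which is what facets will use
sorted_labels <- as.character(sort(label))
print(sorted_labels)
```

Without the explicit `levels`, the factor would sort alphabetically and "High Income" would come before "Poor" in the facets.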



378
oliyiyi posted on 2015-7-11 18:40:47
In comparative effectiveness studies of multicomponent, sequential interventions like blood product transfusion (plasma, platelets, red blood cells) for trauma and critical care patients, the timing and dynamics of treatment relative to the fragility of a patient’s condition are often overlooked and underappreciated. While many hospitals have established massive transfusion protocols to ensure that physiologically optimal combinations of blood products are rapidly available, the period of time required to achieve a specified massive transfusion standard (e.g. a 1:1 or 1:2 ratio of plasma or platelets:red blood cells) has been ignored. To account for the time-varying characteristics of transfusions, we use semiparametric rate models for multivariate recurrent events to estimate blood product ratios. We use latent variables to account for multiple sources of informative censoring (early surgical or endovascular hemorrhage control procedures or death). The major advantage is that the distributions of latent variables and the dependence structure between the multivariate recurrent events and informative censoring need not be specified. Thus, our approach is robust to complex model assumptions. We establish asymptotic properties and evaluate finite sample performance through simulations, and apply the method to data from the PRospective Observational Multicenter Major Trauma Transfusion study.

379
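oliyiyi posted on 2015-7-12 12:56:18
Features like AirPlay and Google Cast are becoming ever more popular, so it is no surprise that Amazon has come up with Fling to compete. Like the two rival services, Fling lets users cast multimedia from a phone to a TV; it supports casting video, audio, and still-image files from Android and iOS devices to a Fire TV or Fire TV Stick. Amazon has now released its SDK, and Karaoke Party and Rivet Radio are Amazon App Store ...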

380
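oliyiyi posted on 2015-7-12 12:56:59
The LX100, Panasonic's most capable compact and a surprise at last year's Photokina show, is a camera this editor has long been itching to try (a pity it took until now!). It offers flagship-class 4K video recording, a large MFT sensor with good focusing performance and image quality, and a 24-75mm-equivalent f/1.7-2.8 fast lens that outspecs most interchangeable-lens cameras. So although the body is a little larger than its predecessor, the LX7, and the aperture and focal range are a little smaller, it actually delivers better overall performance. On top of that, the LX100 body gains dedicated EV and shutter dials; the aperture and function (focus) rings have become separate...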
