楼主: oliyiyi
1026 0

Your data vis “Spidey-sense” & the need for a robust “utility belt” [推广有奖]

版主

已卖:2995份资源

泰斗

1%

还不是VIP/贵宾

-

TA的文库  其他...

计量文库

威望
7
论坛币
66190 个
通用积分
31671.1867
学术水平
1454 点
热心指数
1573 点
信用等级
1364 点
经验
384134 点
帖子
9629
精华
66
在线时间
5508 小时
注册时间
2007-5-21
最后登录
2025-7-8

初级学术勋章 初级热心勋章 初级信用勋章 中级信用勋章 中级学术勋章 中级热心勋章 高级热心勋章 高级学术勋章 高级信用勋章 特级热心勋章 特级学术勋章 特级信用勋章

楼主
oliyiyi 发表于 2016-6-18 12:11:31 |AI写论文

+2 论坛币
k人 参与回答

经管之家送您一份

应届毕业生专属福利!

求职就业群
赵安豆老师微信:zhaoandou666

经管之家联合CDA

送您一个全额奖学金名额~ !

感谢您参与论坛问题回答

经管之家送您两个论坛币!

+2 论坛币
(This article was first published on R – rud.is, and kindly contributed to R-bloggers)

@theboysmithy did a great piece on coming up with an alternate view for a timeline for an FT piece.

Here’s an excerpt (read the whole piece, though, it’s worth it):

Here is an example from a story recently featured in the FT: emerging- market populations are expected to age more rapidly than those in developed countries. The figures alone are compelling: France is expected to take 157 years (from 1865 to 2022) to triple the proportion of its population aged over 65, from 7 per cent to 21 per cent; for China, the equivalent period is likely to be just 34 years (from 2001 to 2035).

[color=rgb(255, 255, 255) !important]



You may think that visualising this story is as simple as creating a bar chart of the durations ordered by length. In fact, we came across just such a chart from a research agency.
But, to me, this approach generates “the feeling” — and further scrutiny reveals specific problems. A reader must work hard to memorise the date information next to the country labels to work out if there is a relationship between the start date and the length of time taken for the population to age. The chart is clearly not ideal, but how do we improve it?

Alan went on to talk about the process of improving the vis, eventually turning to Joseph Priestly for inspiration. Here’s their makeover:


[color=rgb(255, 255, 255) !important]


Alan used D3 to make this, which had me head scratching for a bit. Bostock is genius & I :heart: D3 immensely, but I never really thought of it as a “canvas” for doing general data visualization creation for something like a print publication (it’s geared towards making incredibly data-rich interactive visualizations). It’s 100% cool to do so, though. It has fine-grained control over every aspect of a visualization and you can easily turn SVGs into PDFs or use them in programs like Illustrator to make the final enhancements. However, D3 is not the only tool that can make a chart like this.

I made the following in R (of course):


[color=rgb(255, 255, 255) !important]


The annotations in Alan’s image were (99% most likely) made with something like Illustrator. I stopped short of fully reproducing the image (life is super-crazy, still), but could have done so (the entire image is one ggplot2 object).

This isn’t an “R > D3” post, though, since I use both. It’s about (a) reinforcing Alan’s posits that we should absolutely take inspiration from historical vis pioneers (so read more!) + need a diverse visualization “utility belt” (ref: Batman) to ensure you have the necessary tools to make a given visualization; (b) trusting your “Spidey-sense” when it comes to evaluating your creations/decisions; and, (c) showing that R is a great alternative to D3 for something like this

Spider-man (you expected headier references from a dude with a shield avatar?) has this ability to sense danger right before it happens and if you’re making an effort to develop and share great visualizations, you definitely have this same sense in your DNA (though I would not recommend tossing pie charts at super-villains to stop them). When you’ve made something and it just doesn’t “feel right”, look to other sources of inspiration or reach out to your colleagues or the community for ideas or guidance. You canand do make awesome things, and you do have a “Spidey-sense”. You just need to listen to it more, add depth and breadth to your “utility belt” and keep improving with each creation you release into the wild.

R code for the ggplot vis reproduction is below, and it + the CSV file referenced are in this gist.

library(ggplot2) library(dplyr)  ft <- read.csv("ftpop.csv", stringsAsFactors=FALSE)  arrange(ft, start_year) %>%   mutate(country=factor(country, levels=c(" ", rev(country), "  "))) -> ft  ft_labs <- data_frame(   x=c(1900, 1950, 2000, 2050, 1900, 1950, 2000, 2050),   y=c(rep(" ", 4), rep("  ", 4)),   hj=c(0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5),   vj=c(1, 1, 1, 1, 0, 0, 0, 0) )  ft_lines <- data_frame(x=c(1900, 1950, 2000, 2050))  ft_ticks <- data_frame(x=seq(1860, 2050, 10))  gg <- ggplot()  # tick marks & gridlines gg <- gg + geom_segment(data=ft_lines, aes(x=x, xend=x, y=2, yend=16),                         linetype="dotted", size=0.15) gg <- gg + geom_segment(data=ft_ticks, aes(x=x, xend=x, y=16.9, yend=16.6),                         linetype="dotted", size=0.15) gg <- gg + geom_segment(data=ft_ticks, aes(x=x, xend=x, y=1.1, yend=1.4),                         linetype="dotted", size=0.15)  # double & triple bars gg <- gg + geom_segment(data=ft, size=5, color="#b0657b",                         aes(x=start_year, xend=start_year+double, y=country, yend=country)) gg <- gg + geom_segment(data=ft, size=5, color="#eb9c9d",                         aes(x=start_year+double, xend=start_year+double+triple, y=country, yend=country))  # tick labels gg <- gg + geom_text(data=ft_labs, aes(x, y, label=x, hjust=hj, vjust=vj), size=3)  # annotations gg <- gg + geom_label(data=data.frame(), hjust=0, label.size=0, size=3,                       aes(x=1911, y=7.5, label="France is set to take\n157 years to triple the\nproportion ot its\npopulation aged 65+,\nChina only 34 years")) gg <- gg + geom_curve(data=data.frame(), aes(x=1911, xend=1865, y=9, yend=15.5),                       curvature=-0.5, arrow=arrow(length=unit(0.03, "npc"))) gg <- gg + geom_curve(data=data.frame(), aes(x=1915, xend=2000, y=5.65, yend=5),                       curvature=0.25, arrow=arrow(length=unit(0.03, "npc")))  # pretty standard stuff here gg <- gg + scale_x_continuous(expand=c(0,0), limits=c(1860, 2060)) gg <- gg + scale_y_discrete(drop=FALSE) gg <- gg + labs(x=NULL, y=NULL, title="Emerging markets are ageing at a rapid rate",                 subtitle="Time taken for population aged 65 and over to double and triple in proportion (from 7% of total population)",                 caption="Source: http://on.ft.com/1Ys1W2H") gg <- gg + theme_minimal() gg <- gg + theme(axis.text.x=element_blank()) gg <- gg + theme(panel.grid=element_blank()) gg <- gg + theme(plot.margin=margin(10,10,10,10)) gg <- gg + theme(plot.title=element_text(face="bold")) gg <- gg + theme(plot.subtitle=element_text(size=9.5, margin=margin(b=10))) gg <- gg + theme(plot.caption=element_text(size=7, margin=margin(t=-10))) gg









二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

关键词:Utility robust sense Data Need countries alternate developed published recently

缺少币币的网友请访问有奖回帖集合
https://bbs.pinggu.org/thread-3990750-1-1.html

您需要登录后才可以回帖 登录 | 我要注册

本版微信群
加好友,备注jltj
拉您入交流群
GMT+8, 2026-1-8 00:53