Mapping the Privacy-Utility Tradeoff in Mobile Phone Data for Development
---
Authors:
Alejandro Noriega-Campero, Alex Rutherford, Oren Lederman, Yves A. de Montjoye, and Alex Pentland
---
Latest submission year:
2018
---
Abstract:
Today's age of data holds high potential to enhance the way we pursue and monitor progress in the fields of development and humanitarian action. We study the relation between data utility and privacy risk in large-scale behavioral data, focusing on mobile phone metadata as a paradigmatic domain. To measure utility, we survey experts about the value of mobile phone metadata at various spatial and temporal granularity levels. To measure privacy, we propose a formal and intuitive measure of reidentification risk, the information ratio, and compute it at each granularity level. Our results confirm the existence of a stark tradeoff between data utility and reidentifiability, where the most valuable datasets are also most prone to reidentification. When data is specified at ZIP-code and hourly levels, outside knowledge of only 7% of a person's data suffices for reidentification and retrieval of the remaining 93%. In contrast, in the least valuable dataset, specified at municipality and daily levels, reidentification requires on average outside knowledge of 51%, or 31 data points, of a person's data to retrieve the remaining 49%. Overall, our findings show that coarsening data directly erodes its value, and highlight the need for using data-coarsening, not as a stand-alone mechanism, but in combination with data-sharing models that provide adjustable degrees of accountability and security.
---
Classification:
Top-level category: Computer Science
Subcategory: Computers and Society
Description: Covers impact of computers on society, computer ethics, information technology and public policy, legal aspects of computing, computers and education. Roughly includes material in ACM Subject Classes K.0, K.2, K.3, K.4, K.5, and K.7.
--
Top-level category: Computer Science
Subcategory: Cryptography and Security
Description: Covers all areas of cryptography and security including authentication, public key cryptosystems, proof-carrying code, etc. Roughly includes material in ACM Subject Classes D.4.6 and E.3.
--
Top-level category: Economics
Subcategory: General Economics
Description: General methodological, applied, and empirical contributions to economics.
--
Top-level category: Quantitative Finance
Subcategory: Economics
Description: q-fin.EC is an alias for econ.GN. Economics, including micro and macro economics, international economics, theory of the firm, labor economics, and other economic topics outside finance.
--
---
PDF download:
Mapping_the_Privacy-Utility_Tradeoff_in_Mobile_Phone_Data_for_Development.pdf (831.53 KB)

