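Abstract (translated):
Understanding how users tailor their SPARQL queries is crucial when designing query evaluation engines or tuning RDF stores for performance. In this paper, we analyze 3 million real-world SPARQL queries extracted from the logs of the DBPedia and SWDF public endpoints. Our goal is to identify the most frequently used language elements from both syntactic and structural perspectives, with special attention to triple patterns and joins, since these are among the most expensive SPARQL operations during evaluation. We find that most queries are simple and contain few triple patterns and joins, and that Subject-Subject, Subject-Object and Object-Object are the most common join types. Graph patterns are usually star-shaped; chains of triple patterns do occur, but they are generally short.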
---
English title:
An Empirical Study of Real-World SPARQL Queries
---
Authors:
Mario Arias, Javier D. Fernández, Miguel A. Martínez-Prieto, Pablo de la Fuente
---
Year of latest submission:
2011
---
Categories:
Primary category: Computer Science
Secondary category: Information Retrieval
Category description: Covers indexing, dictionaries, retrieval, content and analysis. Roughly includes material in ACM Subject Classes H.3.0, H.3.1, H.3.2, H.3.3, and H.3.4.
--
Primary category: Computer Science
Secondary category: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Primary category: Computer Science
Secondary category: Human-Computer Interaction
Category description: Covers human factors, user interfaces, and collaborative computing. Roughly includes material in ACM Subject Classes H.1.2 and all of H.5, except for H.5.1, which is more likely to have Multimedia as the primary subject area.
--
---
English abstract:
Understanding how users tailor their SPARQL queries is crucial when designing query evaluation engines or fine-tuning RDF stores with performance in mind. In this paper we analyze 3 million real-world SPARQL queries extracted from logs of the DBPedia and SWDF public endpoints. We aim at finding which are the most used language elements both from syntactical and structural perspectives, paying special attention to triple patterns and joins, since they are indeed some of the most expensive SPARQL operations at evaluation phase. We have determined that most of the queries are simple and include few triple patterns and joins, being Subject-Subject, Subject-Object and Object-Object the most common join types. The graph patterns are usually star-shaped and despite triple pattern chains exist, they are generally short.
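The join types named in the abstract can be illustrated with a small sketch. This is not the authors' analysis tooling: it assumes triple patterns are already available as hand-written (subject, predicate, object) tuples rather than parsed from SPARQL text, treats any term starting with "?" as a variable, and uses a hypothetical helper name (`join_types`) and illustrative prefixed names such as `dbo:director`.

```python
from collections import Counter
from itertools import combinations

LABELS = ("Subject", "Predicate", "Object")


def is_var(term):
    """Assumption: variables are written with a leading '?'."""
    return term.startswith("?")


def join_types(patterns):
    """Count joins by the positions at which two triple patterns share a variable."""
    counts = Counter()
    for a, b in combinations(patterns, 2):
        for i, term_a in enumerate(a):
            for j, term_b in enumerate(b):
                if is_var(term_a) and term_a == term_b:
                    # Normalise the order so Object-Subject counts as Subject-Object.
                    first, second = sorted((i, j))
                    counts[f"{LABELS[first]}-{LABELS[second]}"] += 1
    return counts


# Star-shaped pattern: every triple pattern shares the same subject variable.
star = [
    ("?film", "rdf:type", "dbo:Film"),
    ("?film", "dbo:director", "?dir"),
    ("?film", "rdfs:label", "?name"),
]

# Short chain: the object of one pattern is the subject of the next.
chain = [
    ("?film", "dbo:director", "?dir"),
    ("?dir", "dbo:birthPlace", "?city"),
]

print(join_types(star))
print(join_types(chain))
```

Under these assumptions, the star-shaped pattern yields only Subject-Subject joins, while the chain yields a single Subject-Object join.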
---
PDF link:
https://arxiv.org/pdf/1103.5043

