Title:
Exploiting Social Annotation for Automatic Resource Discovery
---
Authors:
Anon Plangprasopchok and Kristina Lerman
---
Latest submission year:
2007
---
Categories:
Top-level category: Computer Science
Subcategory: Artificial Intelligence
Description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Top-level category: Computer Science
Subcategory: Computers and Society
Description: Covers impact of computers on society, computer ethics, information technology and public policy, legal aspects of computing, computers and education. Roughly includes material in ACM Subject Classes K.0, K.2, K.3, K.4, K.5, and K.7.
--
Top-level category: Computer Science
Subcategory: Digital Libraries
Description: Covers all aspects of digital library design and document and text creation. Note that there will be some overlap with Information Retrieval (which is a separate subject area). Roughly includes material in ACM Subject Classes H.3.5, H.3.6, H.3.7, I.7.
--
---
Abstract:
Information integration applications, such as mediators or mashups, that require access to information resources currently rely on users manually discovering and integrating them in the application. Manual resource discovery is a slow process, requiring the user to sift through results obtained via keyword-based search. Although search methods have advanced to include evidence from a document's contents, its metadata, and the contents and link structure of the referring pages, they still do not adequately cover information sources (often called "the hidden Web") that dynamically generate documents in response to a query. Social bookmarking sites, which have recently become popular and which allow users to annotate and share metadata about various information sources, provide rich evidence for resource discovery. In this paper, we describe a probabilistic model of the user annotation process in the social bookmarking system del.icio.us. We then use the model to automatically find resources relevant to a particular information domain. Our experimental results on data obtained from del.icio.us suggest that this approach is a promising method for helping automate the resource discovery task.
---
PDF link:
https://arxiv.org/pdf/0704.1675