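Abstract (translated from Chinese):
Handling missing values in training datasets, in order to build learning models or extract useful information, is an important research task in data mining and knowledge discovery in databases. In recent years, many methods have been proposed for imputing missing values by considering the attribute relationships between the observation with missing values and the other observations in the training dataset. The main weakness of these techniques is that each relies on a single approach rather than combining several, which is why their accuracy is lower. To improve the accuracy of missing value estimation, this paper introduces a new partial matching concept in association rule mining, which gives better results than the full matching concept the authors described in their earlier work. The imputation technique combines the partial matching concept in association rules with the k-nearest neighbor method. Because it is a hybrid technique, its accuracy is much better than that of techniques that rely on a single approach. To verify the effectiveness of the method, the authors also provide detailed experimental results on a number of benchmark datasets, which show better results than previous approaches.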
---
English title:
Introducing Partial Matching Approach in Association Rules for Better Treatment of Missing Values
---
Authors:
Shariq Bashir, Saad Razzaq, Umer Maqbool, Sonya Tahir, Abdul Rauf Baig
---
Latest submission year:
2009
---
Classification:
Top-level category: Computer Science
Subcategory: Databases
Category description: Covers database management, data mining, and data processing. Roughly includes material in ACM Subject Classes E.2, E.5, H.0, H.2, and J.1.
--
Top-level category: Computer Science
Subcategory: Artificial Intelligence
Category description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Top-level category: Computer Science
Subcategory: Data Structures and Algorithms
Category description: Covers data structures and analysis of algorithms. Roughly includes material in ACM Subject Classes E.1, E.2, F.2.1, and F.2.2.
---
English abstract:
Handling missing values in training datasets for constructing learning models or extracting useful information is considered to be an important research task in data mining and knowledge discovery in databases. In recent years, many techniques have been proposed for imputing missing values by considering the attribute relationships between the observation with missing values and the other observations of the training dataset. The main deficiency of such techniques is that they depend upon a single approach and do not combine multiple approaches, which is why they are less accurate. To improve the accuracy of missing value imputation, in this paper we introduce a novel partial matching concept in association rule mining, which shows better results than the full matching concept that we described in our previous work. Our imputation technique combines the partial matching concept in association rules with the k-nearest neighbor approach. Since this is a hybrid technique, its accuracy is much better than that of techniques which depend upon a single approach. To check the efficiency of our technique, we also provide detailed experimental results on a number of benchmark datasets, which show better results than previous approaches.
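The abstract does not spell out the algorithm, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes rules are given as hypothetical `(antecedent, value, confidence)` triples, treats "partial matching" as accepting a rule when at least a fraction `min_match` of its antecedent items agree with the record, scores a rule by match fraction times confidence, and falls back to a k-nearest-neighbor majority vote (Hamming distance on categorical attributes) when no rule qualifies. All function names, the scoring formula, and the threshold are assumptions made for this example.

```python
from collections import Counter


def match_fraction(antecedent, record):
    """Fraction of antecedent items (attribute -> value) that the record satisfies.

    Assumes the antecedent is non-empty.
    """
    hits = sum(1 for attr, val in antecedent.items() if record.get(attr) == val)
    return hits / len(antecedent)


def impute_with_rules(record, rules, min_match=0.5):
    """Return the consequent of the best-scoring rule that matches at least
    `min_match` of its antecedent, or None if no rule qualifies.

    Setting min_match=1.0 corresponds to full matching in this sketch.
    """
    best_value, best_score = None, 0.0
    for antecedent, value, confidence in rules:
        frac = match_fraction(antecedent, record)
        if frac >= min_match:
            score = frac * confidence  # assumed scoring, not from the paper
            if score > best_score:
                best_value, best_score = value, score
    return best_value


def impute_with_knn(record, complete_rows, target, k=3):
    """Majority vote over the k complete rows closest to the record,
    using Hamming distance on the record's known attributes."""
    known = [a for a, v in record.items() if a != target and v is not None]

    def distance(row):
        return sum(1 for a in known if row[a] != record[a])

    nearest = sorted(complete_rows, key=distance)[:k]
    return Counter(row[target] for row in nearest).most_common(1)[0][0]


def impute(record, rules, complete_rows, target, min_match=0.5, k=3):
    """Hybrid imputation: try partially matching rules first, then k-NN."""
    value = impute_with_rules(record, rules, min_match)
    if value is None:
        value = impute_with_knn(record, complete_rows, target, k)
    return value
```

In this sketch, lowering `min_match` lets more rules fire before the k-NN fallback is needed, while `min_match=1.0` reproduces a full-matching baseline for comparison.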
---
PDF link:
https://arxiv.org/pdf/0904.3321

