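Abstract (translated):
Scene understanding comprises many related sub-tasks, such as scene categorization, depth estimation, and object detection. Each of these sub-tasks is notoriously hard, and state-of-the-art classifiers already exist for many of them. These classifiers operate on the same raw image and produce correlated outputs. An algorithm that captures this correlation without requiring any change to the inner workings of any classifier is therefore desirable. We propose Feedback Enabled Cascaded Classification Models (FE-CCM), which jointly optimize all the sub-tasks while requiring only a "black-box" interface to the original classifier for each sub-task. We use a two-layer cascade of classifiers that are repeated instantiations of the original ones, with the output of the first layer fed into the second layer as input. Our training method includes a feedback step that lets later classifiers tell earlier classifiers which error modes to focus on. We show that our method significantly improves performance on all the sub-tasks considered in the scene-understanding domain: depth estimation, scene categorization, event categorization, object detection, geometric labeling, and saliency detection. Our method also improves performance in two robotic applications: an object-grasping robot and an object-finding robot.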
---
English title:
Towards Holistic Scene Understanding: Feedback Enabled Cascaded Classification Models
---
Authors:
Congcong Li, Adarsh Kowdle, Ashutosh Saxena, Tsuhan Chen
---
Year of latest submission:
2011
---
Categories:
Top-level category: Computer Science
Subcategory: Computer Vision and Pattern Recognition
Description: Covers image processing, computer vision, pattern recognition, and scene understanding. Roughly includes material in ACM Subject Classes I.2.10, I.4, and I.5.
--
Top-level category: Computer Science
Subcategory: Artificial Intelligence
Description: Covers all areas of AI except Vision, Robotics, Machine Learning, Multiagent Systems, and Computation and Language (Natural Language Processing), which have separate subject areas. In particular, includes Expert Systems, Theorem Proving (although this may overlap with Logic in Computer Science), Knowledge Representation, Planning, and Uncertainty in AI. Roughly includes material in ACM Subject Classes I.2.0, I.2.1, I.2.3, I.2.4, I.2.8, and I.2.11.
--
Top-level category: Computer Science
Subcategory: Robotics
Description: Roughly includes material in ACM Subject Class I.2.9.
--
---
English abstract:
Scene understanding includes many related sub-tasks, such as scene categorization, depth estimation, object detection, etc. Each of these sub-tasks is often notoriously hard, and state-of-the-art classifiers already exist for many of them. These classifiers operate on the same raw image and provide correlated outputs. It is desirable to have an algorithm that can capture such correlation without requiring any changes to the inner workings of any classifier. We propose Feedback Enabled Cascaded Classification Models (FE-CCM), that jointly optimizes all the sub-tasks, while requiring only a `black-box' interface to the original classifier for each sub-task. We use a two-layer cascade of classifiers, which are repeated instantiations of the original ones, with the output of the first layer fed into the second layer as input. Our training method involves a feedback step that allows later classifiers to provide earlier classifiers information about which error modes to focus on. We show that our method significantly improves performance in all the sub-tasks in the domain of scene understanding, where we consider depth estimation, scene categorization, event categorization, object detection, geometric labeling and saliency detection. Our method also improves performance in two robotic applications: an object-grasping robot and an object-finding robot.
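
The two-layer cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `NearestCentroid` and `TwoLayerCascade`, the `fit`/`predict` interface, and the toy data are all assumptions made for illustration, and the feedback step of FE-CCM is deliberately omitted. The sketch only shows the black-box idea: the second layer re-instantiates the same classifiers and receives the raw features concatenated with every first-layer output.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


class NearestCentroid:
    """Hypothetical stand-in for a black-box classifier.

    The cascade only ever calls fit() and predict() on it.
    """

    def fit(self, X: Sequence[Vector], y: Sequence[int]) -> "NearestCentroid":
        sums: Dict[int, Vector] = {}
        counts: Dict[int, int] = {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        # One mean vector per class label.
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict(self, X: Sequence[Vector]) -> List[int]:
        def nearest(x: Vector) -> int:
            return min(
                self.centroids,
                key=lambda c: sum((a - b) ** 2 for a, b in zip(x, self.centroids[c])),
            )

        return [nearest(x) for x in X]


class TwoLayerCascade:
    """Feed-forward two-layer cascade (no feedback step; illustrative only)."""

    def __init__(self, factories: Dict[str, Callable[[], NearestCentroid]]):
        self.factories = factories  # one classifier factory per sub-task

    def _augment(self, X: Sequence[Vector]) -> List[Vector]:
        # Append every first-layer output to the raw features of each sample.
        outputs = [self.layer1[task].predict(X) for task in self.factories]
        return [list(x) + [float(o[i]) for o in outputs] for i, x in enumerate(X)]

    def fit(self, X: Sequence[Vector], labels: Dict[str, Sequence[int]]) -> "TwoLayerCascade":
        # Layer 1: each sub-task classifier sees only the raw features.
        self.layer1 = {t: make().fit(X, labels[t]) for t, make in self.factories.items()}
        # Layer 2: fresh instances of the same classifiers see raw features
        # plus the first-layer outputs of all sub-tasks.
        X2 = self._augment(X)
        self.layer2 = {t: make().fit(X2, labels[t]) for t, make in self.factories.items()}
        return self

    def predict(self, X: Sequence[Vector]) -> Dict[str, List[int]]:
        X2 = self._augment(X)
        return {t: self.layer2[t].predict(X2) for t in self.factories}


# Toy data: one raw feature, two hypothetical sub-tasks with identical labels.
X = [[0.0], [0.1], [0.9], [1.0]]
labels = {"scene": [0, 0, 1, 1], "object": [0, 0, 1, 1]}
cascade = TwoLayerCascade({"scene": NearestCentroid, "object": NearestCentroid}).fit(X, labels)
predictions = cascade.predict(X)
```

Because the cascade touches each classifier only through `fit` and `predict`, any model exposing those two methods could be swapped in without changing its internals, which is the property the abstract refers to as a black-box interface.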
---
PDF link:
https://arxiv.org/pdf/1110.5102