摘要翻译:
情感识别有三个方面的挑战。首先,仅仅考虑单一的情态很难识别人的情感状态。其次,人工注释情感数据的成本很高。第三,由于不可预见的传感器故障或配置问题,情感数据经常会丢失模态。在本文中,我们在一个新的多视图深度生成框架下解决了所有这些问题。具体地说,我们提出使用具有共享潜在空间的多个情态特定生成网络来建模多情态情绪数据的统计关系。通过对共享潜在变量的后验逼近引入高斯混合假设,我们的框架可以从多个模态中学习联合深度表示,并同时评估每个模态的重要性。为了解决标记数据稀缺性问题,我们将多视图模型扩展到半监督学习场景,将半监督分类问题转化为一个专门的缺失数据归责任务。为了解决缺失模态问题,我们进一步扩展了半监督多视图模型来处理不完整数据,将缺失视图作为潜在变量,在推理过程中进行集成。这样,所提出的总体框架可以利用所有可用的(包括标记和未标记,以及完整和不完整)数据来提高其泛化能力。在两个真实的多模态情感数据集上进行的实验证明了该框架的优越性。
---
英文标题:
《Semi-supervised Deep Generative Modelling of Incomplete Multi-Modality
Emotional Data》
---
作者:
Changde Du, Changying Du, Hao Wang, Jinpeng Li, Wei-Long Zheng,
Bao-Liang Lu, Huiguang He
---
最新提交年份:
2018
---
分类信息:
一级分类:Electrical Engineering and Systems Science 电气工程与系统科学
二级分类:Signal Processing 信号处理
分类描述:Theory, algorithms, performance analysis and applications of signal and data analysis, including physical modeling, processing, detection and parameter estimation, learning, mining, retrieval, and information extraction. The term "signal" includes speech, audio, sonar, radar, geophysical, physiological, (bio-) medical, image, video, and multimodal natural and man-made signals, including communication signals and data. Topics of interest include: statistical signal processing, spectral estimation and system identification; filter design, adaptive filtering / stochastic learning; (compressive) sampling, sensing, and transform-domain methods including fast algorithms; signal processing for machine learning and machine learning for signal processing applications; in-network and graph signal processing; convex and nonconvex optimization methods for signal processing applications; radar, sonar, and sensor array beamforming and direction finding; communications signal processing; low power, multi-core and system-on-chip signal processing; sensing, communication, analysis and optimization for cyber-physical systems such as power grids and the Internet of Things.
信号和数据分析的理论、算法、性能分析和应用,包括物理建模、处理、检测和参数估计、学习、挖掘、检索和信息提取。“信号”一词包括语音、音频、声纳、雷达、地球物理、生理、(生物)医学、图像、视频和多模态自然和人为信号,包括通信信号和数据。感兴趣的主题包括:统计信号处理、谱估计和系统辨识;滤波器设计;自适应滤波/随机学习;(压缩)采样、传感和变换域方法,包括快速算法;用于机器学习的信号处理和用于信号处理应用的机器学习;网络与图形信号处理;信号处理中的凸和非凸优化方法;雷达、声纳和传感器阵列波束形成和测向;通信信号处理;低功耗、多核、片上系统信号处理;信息物理系统的传感、通信、分析和优化,如电网和物联网。
--
一级分类:Computer Science 计算机科学
二级分类:Computer Vision and Pattern Recognition 计算机视觉与模式识别
分类描述:Covers image processing, computer vision, pattern recognition, and scene understanding. Roughly includes material in ACM Subject Classes I.2.10, I.4, and I.5.
涵盖图像处理、计算机视觉、模式识别和场景理解。大致包括ACM课程I.2.10、I.4和I.5中的材料。
--
一级分类:Computer Science 计算机科学
二级分类:Machine Learning 机器学习
分类描述:Papers on all aspects of machine learning research (supervised, unsupervised, reinforcement learning, bandit problems, and so on) including also robustness, explanation, fairness, and methodology. cs.LG is also an appropriate primary category for applications of machine learning methods.
关于机器学习研究的所有方面的论文(有监督的,无监督的,强化学习,强盗问题,等等),包括健壮性,解释性,公平性和方法论。对于机器学习方法的应用,CS.LG也是一个合适的主要类别。
--
一级分类:Computer Science 计算机科学
二级分类:Multimedia 多媒体
分类描述:Roughly includes material in ACM Subject Class H.5.1.
大致包括ACM学科类H.5.1中的材料。
--
---
英文摘要:
There are threefold challenges in emotion recognition. First, it is difficult to recognize human's emotional states only considering a single modality. Second, it is expensive to manually annotate the emotional data. Third, emotional data often suffers from missing modalities due to unforeseeable sensor malfunction or configuration issues. In this paper, we address all these problems under a novel multi-view deep generative framework. Specifically, we propose to model the statistical relationships of multi-modality emotional data using multiple modality-specific generative networks with a shared latent space. By imposing a Gaussian mixture assumption on the posterior approximation of the shared latent variables, our framework can learn the joint deep representation from multiple modalities and evaluate the importance of each modality simultaneously. To solve the labeled-data-scarcity problem, we extend our multi-view model to semi-supervised learning scenario by casting the semi-supervised classification problem as a specialized missing data imputation task. To address the missing-modality problem, we further extend our semi-supervised multi-view model to deal with incomplete data, where a missing view is treated as a latent variable and integrated out during inference. This way, the proposed overall framework can utilize all available (both labeled and unlabeled, as well as both complete and incomplete) data to improve its generalization ability. The experiments conducted on two real multi-modal emotion datasets demonstrated the superiority of our framework.
---
PDF链接:
https://arxiv.org/pdf/1808.02096


雷达卡



京公网安备 11010802022788号







