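Abstract (translation):
In this brief paper, we study the online training of Long Short-Term Memory (LSTM) architectures in a distributed network of nodes, where each node uses an LSTM-based structure for online regression. In particular, each node sequentially receives a variable-length data sequence together with its label and can exchange information only with its neighbors to train the LSTM architecture. We first provide a generic LSTM-based regression structure for each node. To train this structure, we express the LSTM equations of each node in nonlinear state-space form and then introduce an efficient training algorithm based on distributed particle filtering (DPF). For comparison, we also introduce a training algorithm based on distributed extended Kalman filtering (DEKF). Under certain conditions, our DPF-based training algorithm is guaranteed to converge to the performance of the optimal LSTM coefficients in the mean square error (MSE) sense. We achieve this performance with communication and computational complexity on the order of first-order gradient methods. Through simulated and real-life examples, we demonstrate significant performance improvements over existing methods.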
---
English title:
Online Training of LSTM Networks in Distributed Systems for Variable Length Data Sequences
---
Authors:
Tolga Ergen and Suleyman Serdar Kozat
---
Latest submission year:
2017
---
Classification:
Primary category: Electrical Engineering and Systems Science
Secondary category: Signal Processing
Category description: Theory, algorithms, performance analysis and applications of signal and data analysis, including physical modeling, processing, detection and parameter estimation, learning, mining, retrieval, and information extraction. The term "signal" includes speech, audio, sonar, radar, geophysical, physiological, (bio-) medical, image, video, and multimodal natural and man-made signals, including communication signals and data. Topics of interest include: statistical signal processing, spectral estimation and system identification; filter design, adaptive filtering / stochastic learning; (compressive) sampling, sensing, and transform-domain methods including fast algorithms; signal processing for machine learning and machine learning for signal processing applications; in-network and graph signal processing; convex and nonconvex optimization methods for signal processing applications; radar, sonar, and sensor array beamforming and direction finding; communications signal processing; low power, multi-core and system-on-chip signal processing; sensing, communication, analysis and optimization for cyber-physical systems such as power grids and the Internet of Things.
---
English abstract:
In this brief paper, we investigate online training of Long Short Term Memory (LSTM) architectures in a distributed network of nodes, where each node employs an LSTM based structure for online regression. In particular, each node sequentially receives a variable length data sequence with its label and can only exchange information with its neighbors to train the LSTM architecture. We first provide a generic LSTM based regression structure for each node. In order to train this structure, we put the LSTM equations in a nonlinear state space form for each node and then introduce a highly effective and efficient Distributed Particle Filtering (DPF) based training algorithm. We also introduce a Distributed Extended Kalman Filtering (DEKF) based training algorithm for comparison. Here, our DPF based training algorithm guarantees convergence to the performance of the optimal LSTM coefficients in the mean square error (MSE) sense under certain conditions. We achieve this performance with communication and computational complexity in the order of the first order gradient based methods. Through both simulated and real life examples, we illustrate significant performance improvements with respect to the state of the art methods.
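
The abstract's central idea, training by state estimation, can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a single node, a scalar linear model `y = w*x + noise` in place of an LSTM, and a plain bootstrap particle filter with no exchange between neighbors. The function names, the random-walk state equation, and all parameter values are hypothetical choices made for illustration only.

```python
import math
import random


def make_data(true_w=0.7, n=200, obs_std=0.1, seed=1):
    """Generate toy regression data y = true_w * x + Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_w * x + rng.gauss(0.0, obs_std) for x in xs]
    return xs, ys


def particle_filter_fit(xs, ys, n_particles=500, obs_std=0.1,
                        jitter_std=0.01, seed=0):
    """Estimate w online by treating it as the hidden state of a
    state-space model (assumed form, for illustration):
        state:       w_t = w_{t-1} + small Gaussian jitter
        observation: y_t = w_t * x_t + Gaussian noise
    """
    rng = random.Random(seed)
    # Initial particles drawn from a broad prior over the parameter.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimate = 0.0
    for x, y in zip(xs, ys):
        # Propagate every particle through the state equation.
        particles = [w + rng.gauss(0.0, jitter_std) for w in particles]
        # Gaussian log-likelihood of the new observation under each particle.
        log_w = [-0.5 * ((y - w * x) / obs_std) ** 2 for w in particles]
        # Subtract the maximum before exponentiating to avoid underflow.
        max_log = max(log_w)
        weights = [math.exp(lw - max_log) for lw in log_w]
        total = sum(weights)
        # Weighted mean of the particles is the current parameter estimate.
        estimate = sum(wt * p for wt, p in zip(weights, particles)) / total
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimate


if __name__ == "__main__":
    xs, ys = make_data()
    print(particle_filter_fit(xs, ys))
```

In this toy setting each sample is processed once, in order, and no gradient is computed; the estimate is refined purely by reweighting and resampling candidate parameter values. A distributed variant would additionally combine information across neighboring nodes, which this sketch leaves out.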
---
PDF link:
https://arxiv.org/pdf/1710.08744