Original poster: oliyiyi

3 practical thoughts on why deep learning performs so well

By Aureli Soria-Frisch, Neuroelectrics.

The superior performance of deep learning relative to other machine learning methodologies has been discussed in several forums and magazines in recent times. I would like to post today on three reasons that, in my opinion, underlie this superiority. I am not the first to comment on this issue [1], and I surely won't be the last. But I would like to extend the discussion by taking into account the practical reasons behind the success of deep learning. Hence, if you are looking for its theoretical background, you would do better to look in the deep learning literature, where the Hamiltonian of the spin-glass model [2], the exploitation of compositional functions to cope with the curse of dimensionality [3], the capability of deep networks to represent the simplicity of physics-based functions [4], and the flattening of data manifolds [5] have been proposed. All of these are related, but I would like to comment here on the practical aspects that turn deep learning into a useful technology for tackling some practical classification problems and applications. But not all problems and applications: Deep Learning methodologies appear today as the ultimate pattern recognition technology, as Support Vector Machines and Random Forests did before them, yet according to the No Free Lunch Theorem [6] there is no algorithm that is optimal for all possible problems. Therefore it is always good to keep the evaluation of classification performance [7] in mind. Performance evaluation constitutes a basic stage in machine learning applications, and it is directly linked to the success of Deep Learning, as described below.

The first important reason for the Deep Learning success is the integration of feature extraction within the training process.

Not so many years ago, pattern recognition was focused on the classification stage, and feature extraction was treated as a somewhat independent problem. That part of the work was very much based on manual craftsmanship and expert knowledge: you used to invite an expert on the topic you wanted to solve to join your development team. For instance, you would count on an experienced electrophysiologist if you wanted to classify EEG epochs, or on a graphologist if you wanted to recognize handwriting. This expert knowledge was used to select the features of interest for a particular problem. In contrast, Deep Learning approaches do not fix the features of interest a priori; they train the feature extraction and classification stages together. For instance, a set of image filters or primitives is learned in the first layers of a network trained for image recognition (see the sketch below). A similar concept was already proposed in the Brain-Computer Interface community some years ago, where Common Spatial Patterns (CSP) were trained in order to adapt the feature extraction stage to each BCI user.
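To make this concrete, here is a minimal sketch, not taken from the article, of what joint training looks like in PyTorch (my choice of framework; the architecture, sizes, and random data are purely illustrative). The convolutional layers play the role of the learned feature extractor, and a single loss updates them together with the classifier:

import torch
import torch.nn as nn

class SmallConvNet(nn.Module):
    # The conv stack learns the image filters that used to be hand-crafted;
    # the final linear layer is the classification stage. Both are trained
    # jointly by backpropagating one loss.
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)          # learned feature extraction
        return self.classifier(x.flatten(1))

# One gradient step updates filters and classifier weights together
# (random tensors stand in for a batch of 28x28 grayscale images).
model = SmallConvNet()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()

The point of the sketch is that the filters in self.features are parameters learned from data, not choices made by a domain expert.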

Moreover, Deep Learning has succeeded on previously unsolved problems because it encourages the collection of large data sets and the systematic integration of performance evaluation into the development process.

These are two sides of the same coin. Once you have large data sets, you can no longer conduct performance evaluation through a manual procedure; you need to automate the process as much as possible. This automation implies implementing a cross-validation stage and therefore integrating it into the development process. Large data set collection and performance evaluation have also found very good support in the popularization of data analysis challenges and platforms. The first challenges and competitions were organized around the most important computer vision and pattern recognition conferences, as was the case with the PASCAL [8] and ImageNet [9] challenges. These challenges made it possible for the first time to work with large image data sets and, most importantly, with associated ground-truth labels for systematic performance evaluation of the algorithms. Furthermore, the evaluation was performed blind with respect to the test set ground truth, so it was not possible to tune parameters to boost performance. The same challenge concept was then applied on data analysis platforms, of which Kaggle [10] is the most popular but not the only one, e.g. DrivenData [11] and InnoCentive [12]. These contest platforms use the same concepts: they provide a data set for training and a blind data set for testing, plus a platform to compare the performance achieved by different competing groups. This has definitely been a good playground for data science in general, and for Deep Learning in particular.
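As an illustrative sketch of what automating the evaluation means in practice (scikit-learn and the toy digits data set are my assumptions, not something the article prescribes), a k-fold cross-validation run becomes a single call in the pipeline rather than a manual procedure:

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Automated 5-fold cross-validation: the data are split, the model is
# retrained and scored on each held-out fold, and the fold scores are
# aggregated, with no manual bookkeeping.
X, y = load_digits(return_X_y=True)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

The challenge platforms push this one step further: the final test labels are withheld entirely, so the last evaluation round cannot be used for parameter tuning.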

The last thought I would like to share with you is closely related to the previous two: none of the innovations mentioned above would have been possible without technology development.

A steady decrease in memory and storage prices [13] allows storing data sets of ever increasing volume. The well-known Moore's law describes the simultaneous, dramatic increase in computational power. Lastly, the explosion of network technologies has definitively democratized access to both memory and computational power. Cloud repositories and High Performance Computing are allowing an exponential increase in the complexity of the implemented architectures, i.e. the number of network layers and nodes taken into account. So far this has proven to be the most successful path to follow. For the moment …
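For illustration only, and not from the article: with current frameworks, growing the architectural complexity mentioned above, i.e. the number of layers and nodes, is mostly a loop plus available compute. A hypothetical PyTorch sketch:

import torch
import torch.nn as nn

def make_mlp(depth, width=256, num_classes=10):
    # Stack `depth` hidden layers; scaling depth and width is a one-line
    # change, so the practical limit is memory and compute, not code.
    layers = [nn.Flatten(), nn.Linear(28 * 28, width), nn.ReLU()]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), nn.ReLU()]
    layers.append(nn.Linear(width, num_classes))
    return nn.Sequential(*layers)

# Commodity GPUs or cloud instances make the deeper stacks tractable.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = make_mlp(depth=16).to(device)
print(sum(p.numel() for p in model.parameters()), "parameters on", device)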

[1] http://www.kdnuggets.com/2016/08/yann-lecun-3-thoughts-deep-learning.html

[2] https://arxiv.org/pdf/1412.0233.pdf

[3] https://arxiv.org/pdf/1611.00740v2.pdf

[4] https://arxiv.org/abs/1608.08225

[5] http://ieeexplore.ieee.org/document/7348689/

[6] https://en.wikipedia.org/wiki/No_free_lunch_in_search_and_optimization

[7] http://blog.neuroelectrics.com/how-good-is-my-computational-intelligence-algorithm-for-eeg-analysis/

[8] http://host.robots.ox.ac.uk/pascal/VOC/

[9] http://www.image-net.org/challenges/LSVRC/

[10] https://www.kaggle.com/

[11] https://www.drivendata.org/

[12] https://www.innocentive.com/

[13] http://www.jcmit.com/mem2015.htm



Reply #1
oliyiyi, posted 2017-2-4 14:39:45:
I'm reposting these articles as I read them; I hope you all enjoy them.
