OP: oliyiyi

Deep Residual Networks for Image Classification with Python + NumPy


oliyiyi posted on 2016-7-7 23:05:39

This post outlines the results of an innovative Deep Residual Network implementation for image classification using Python and NumPy.

By Daniele Ciriello, Independent Machine Learning Researcher.

Overview


I wanted to implement “Deep Residual Learning for Image Recognition” from scratch in Python for my master’s thesis in computer engineering. I ended up implementing a simple (CPU-only) deep learning framework along with the residual model, and trained it on CIFAR-10, MNIST, and SFDDD. The results speak for themselves.

Convolutional Neural Networks for Computer Vision


On Monday, June 13th, I graduated with a master’s degree in computer engineering, presenting a thesis on deep convolutional neural networks for computer vision. For now it is available only in Italian; I am working on the English translation, but I don’t know if and when I’ll have the time to finish it, so I will briefly describe each chapter below.

The document is composed as follows:

  • Introduction

An introduction to the topic, a description of the thesis’ structure, and a brief history of neural networks, from the perceptron to the Neocognitron.

  • Neural Networks fundamentals

    A description of the fundamental mathematical concepts behind deep learning.

  • State of the Art

A description of the main concepts behind the achievements of the last decade, an introduction to the image classification and object localization problems, ILSVRC, and the models that obtained the best results on both tasks from 2012 to 2015.

  • Implementing a Deep Learning Framework

This chapter explains how to implement both the forward and backward steps for each of the layers used by the residual model, the residual model’s implementation, and some methods to test a network before training.

  • Experimental Results

After developing the model and a solver to train it, I conducted several experiments with the residual model on CIFAR-10. In this chapter I show how I tested the model and how the network’s behavior changes when one removes the residual paths, applies data-augmentation functions to reduce overfitting, or increases the number of layers. Then I show how to fool a trained network using randomly generated images or images from the dataset.

  • Conclusions

Here I describe other results obtained by training the same model on MNIST and SFDDD (see below for more info), an overview of the project, and possible future work.


Thesis links:

Presentation links:

Below I briefly describe how I got all of this, the sources I used, the structure of the residual model I trained, and the results I obtained. Please keep in mind that my first objective was to develop and train the model, so I didn’t spend much time on the design aspect of the framework, but I’m working on it (and pull requests are welcome)!

PyFunt, PyDatSet and Deep Residual Networks


PyFunt is a simple, Pythonic, imperative deep learning framework. It mainly provides implementations of the forward and backward steps for the most common neural network layers, some useful initialization functions, and a solver: essentially a class that you instantiate, passing it the model to be trained and the data loaded with pydatset, which contains functions to import some datasets and a set of functions to artificially augment the training set. Just to clarify, PyFunt and PyDatSet are the names of the repos, while pyfunt and pydatset are the names of the packages (so you import them with from pydatset import ...).
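To make the forward/backward pattern concrete, below is a minimal, self-contained NumPy sketch of the kind of layer interface such a framework provides, together with the numerical gradient check you would run on a layer before trusting it in training. The function names here are illustrative, not the actual pyfunt API.

    import numpy as np

    def relu_forward(x):
        """Forward pass for a ReLU layer; caches the input for the backward pass."""
        out = np.maximum(0, x)
        cache = x
        return out, cache

    def relu_backward(dout, cache):
        """Backward pass: the gradient flows only where the input was positive."""
        x = cache
        return dout * (x > 0)

    # Numerical check of the analytic gradient -- the kind of test you
    # run on a layer in isolation before training a whole network with it.
    x = np.random.randn(4, 4)
    dout = np.random.randn(4, 4)
    dx = relu_backward(dout, x)

    h = 1e-6
    dx_num = np.zeros_like(x)
    for i in range(x.size):
        xp = x.copy(); xp.flat[i] += h
        xm = x.copy(); xm.flat[i] -= h
        dx_num.flat[i] = np.sum((np.maximum(0, xp) - np.maximum(0, xm)) * dout) / (2 * h)

    print(np.allclose(dx, dx_num, atol=1e-5))  # should print True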

The residual model implementation resides in deep-residual-networks-pyfunt, which also contains the train.py file.

The residual model proposed in the reference paper is derived from the VGG model: it uses 3x3 convolution filters, applied with a stride of 1 when the number of channels stays constant and a stride of 2 when the number of feature maps doubles (this is done to preserve the computational complexity of each convolutional layer). The residual model is thus composed of a cascade of residual blocks (or residual layers): groups of convolutional layers in series, where the output of the last layer is added to the block’s original input. The authors suggest that a pair of conv layers per residual block should work well.

Within each residual block, if dimensionality reduction is applied (a convolution stride of 2 instead of 1), downsampling and zero-padding must be applied to the input before the addition, in order to allow the sum of the two ndarrays (skip_path + conv_out).
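As an illustration, here is a minimal NumPy sketch of that skip-path adjustment. The helper name and the padding scheme (appending the extra zero channels at the end) are my own simplifications, not necessarily the repo's exact implementation.

    import numpy as np

    def adjust_skip_path(x, out_channels, stride):
        """Downsample spatially and zero-pad channels so the skip path
        matches the conv output's (channels, height, width) shape."""
        skip = x[:, ::stride, ::stride]       # spatial downsampling
        pad = out_channels - skip.shape[0]    # extra channels needed
        if pad > 0:
            skip = np.pad(skip, ((0, pad), (0, 0), (0, 0)), mode='constant')
        return skip

    # A [16, 32, 32] feature map entering a block that doubles the
    # channels to 32 and halves the spatial size with stride 2:
    x = np.random.randn(16, 32, 32)
    conv_out = np.random.randn(32, 16, 16)    # stand-in for the block's conv output
    out = adjust_skip_path(x, 32, 2) + conv_out
    print(out.shape)                          # (32, 16, 16)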

A parametric residual network has a total of (6*n)+2 layers, composed as below (the values on the right represent the dimensions of a [3,32,32] sample, like CIFAR images). For example, n = 3 gives a 20-layer network and n = 5 a 32-layer one.
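For a CIFAR-sized input, the count breaks down as in the following sketch, which follows the CIFAR architecture described in the reference paper: one initial conv layer, three stages of n two-layer residual blocks, and a final fully connected layer.

    # (6*n)+2 = 1 initial conv + 3 stages of n residual blocks
    # (2 conv layers each) + 1 final fully connected layer.
    n = 3                                                      # -> 6*3 + 2 = 20 layers
    stages = [
        ("initial 3x3 conv, 16 filters",        (16, 32, 32)), # 1 layer
        ("%d residual blocks, 16 filters" % n,  (16, 32, 32)), # 2n layers
        ("%d residual blocks, 32 filters" % n,  (32, 16, 16)), # 2n layers, first conv strides 2
        ("%d residual blocks, 64 filters" % n,  (64, 8, 8)),   # 2n layers, first conv strides 2
        ("global average pool + 10-way softmax", (10,)),       # 1 layer
    ]
    for name, shape in stages:
        print("%-38s -> %s" % (name, shape))
    print("total layers:", 6 * n + 2)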

You can see below a sort of package diagram that shows how train.py uses the other components to train the residual model.

After I had every piece in place, I started experimenting with what happens when you remove the residual paths, when you apply (or don’t apply) data-augmentation functions to the training set, and when you increase the number of layers or the number of filters per layer. Below you can find some images of the results, but I suggest taking a look at the respective Jupyter notebooks (in addition to the thesis and presentation linked above) for a deeper understanding, as they contain a more exhaustive description of the results on all the datasets shown below.

Results


I trained the residual model on CIFAR-10, MNIST, and SFDDD, and the results are really exciting, at least for me. The networks learned well in nearly every test I ran; the obvious limit was the capacity of my desktop PC.

CIFAR-10


One of the experiments on CIFAR-10 involved training a simple 20-layer ResNet. Applying data-augmentation regularization functions, I obtained results similar to those reported in the reference paper, as you can see below.

The training for this model took approximately 10 hours. More info is available in this Jupyter notebook from the repo’s docs folder.

MNIST


MNIST is a much simpler dataset than CIFAR-10, so training times are relatively shorter, and I also tried using half the number of filters in each conv layer.

More info on the experiments with residual networks on MNIST is available here.

In the image above you can see all the wrongly classified validation samples from the 32-layer network, trained for just 30 epochs(!). The upper left shows the ground-truth class, the lower left the network’s wrong classification, and the lower right the second most confident classification.

SFDDD



State Farm Distracted Driver Detection is a dataset from State Farm hosted on kaggle.com; it contains 640x480 images of drivers in 10 classes of distraction. For this dataset I decided to resize all the images to 64x48, using random 32x32 crops for training and the center 32x32 crop for testing. I also tried directly scaling all the images to 32x32, but the results were worse (confirming that downscaling the images doesn’t help conv nets learn more general features).
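A minimal NumPy sketch of this cropping scheme (the helper names are mine, for illustration; pydatset's actual augmentation functions may differ):

    import numpy as np

    def random_crop(img, size=32):
        """Random crop for training: pick a random top-left corner."""
        c, h, w = img.shape
        top = np.random.randint(0, h - size + 1)
        left = np.random.randint(0, w - size + 1)
        return img[:, top:top + size, left:left + size]

    def center_crop(img, size=32):
        """Deterministic center crop for testing."""
        c, h, w = img.shape
        top, left = (h - size) // 2, (w - size) // 2
        return img[:, top:top + size, left:left + size]

    img = np.random.randn(3, 48, 64)   # a resized 64x48 SFDDD image (C, H, W)
    print(random_crop(img).shape)      # (3, 32, 32)
    print(center_crop(img).shape)      # (3, 32, 32)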

Below you can see the learning curves for two models, of 32 and 44 layers respectively. It looks like both models reach a low error after 80 epochs, but the problem here is that for the validation set I used 2k images randomly extracted from the training set, so my validation set is more strongly correlated with the training set than State Farm’s proposed validation set is (on which I got an error of roughly 3%).

Below you can see the saliency maps for six images of the class “talking on the phone with the right hand”, where the lighter zones represent the portions of the image that contributed most to a correct classification by the network.
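One common way to obtain such maps (e.g. as in CS231N's saliency exercise) is to backpropagate the correct-class score down to the input pixels and keep the gradient's magnitude. A minimal sketch of that post-processing step follows; the gradient is faked with random data here just to show the shapes, whereas a real one comes from the network's backward pass.

    import numpy as np

    # dx: gradient of the correct-class score w.r.t. the input image,
    # shape (channels, height, width). Faked here for illustration.
    dx = np.random.randn(3, 32, 32)

    saliency = np.max(np.abs(dx), axis=0)  # strongest channel response per pixel
    saliency /= saliency.max()             # normalize to [0, 1] for display
    print(saliency.shape)                  # (32, 32)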

More info will be available here after the competition ends.

Final Words


I hope my projects can help you learn something new. If not, maybe you can teach me something new; comments and pull requests are welcome as always!

Sources


When I started thinking about implementing “Deep Residual Learning for Image Recognition”, the only project on GitHub was this one from gcr, based on Lua + Torch; that code really helped me a lot when I had to implement the residual model.

Neural Networks and Deep Learning by Michael Nielsen contains a really well-organized, exhaustive introduction to the subject and a lot of code to help the reader understand what is going on in each part of the process.

colah.github.io by Christopher Olah has a lot of very well-written posts about deep learning and NNs; for example, I found this post about convolution layers really illuminating.

Stanford’s CS231N by Andrej Karpathy et al. is a really interesting course about CNNs for visual recognition; I mainly used the course material and my assignment solutions to build PyFunt.





hjtoh replied on 2016-7-7 23:09:58 (from mobile):
Great work, OP!

condmn replied on 2016-7-7 23:22:20 (from mobile):
(⊙o⊙)…

tt_abc replied on 2016-7-7 23:43:31:
Bump.

nndbc replied on 2016-7-7 23:44:07:
Good material to learn from.

tt_abc replied on 2016-7-7 23:44:36:
Support!

h2h2 replied on 2016-7-8 01:24:10:
Thanks

h2h2 replied on 2016-7-8 04:09:00:
Thanks for sharing; good material to learn from.

h2h2 replied on 2016-7-8 04:15:24:
Earning forum coins; thanks, OP!

h2h2 replied on 2016-7-8 04:16:07:
Support the OP!