
Software Engineering vs Machine Learning Concepts


oliyiyi posted on 2017-2-19 12:41:00


Not all core concepts from software engineering translate into the machine learning universe. Here are some differences I've noticed.

Divide and Conquer

A key technique in software engineering is to break a problem down into simpler subproblems, solve those subproblems, and then compose them into a solution to the original problem. Arguably, this is the entire job, recursively applied until the solution can be expressed in a single line in whatever programming language is being used. The canonical pedagogical example is the Tower of Hanoi.
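For reference, here is a minimal Python sketch of that canonical example: moving n disks is reduced to two smaller instances of the same problem plus a single move, and each subproblem is solved exactly.

def hanoi(n, source, target, spare):
    """Print the moves that transfer n disks from source to target."""
    if n == 0:
        return
    hanoi(n - 1, source, spare, target)   # subproblem: clear the top n-1 disks out of the way
    print(f"move disk {n} from {source} to {target}")
    hanoi(n - 1, spare, target, source)   # subproblem: stack them back on top of the moved disk

hanoi(3, "A", "C", "B")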

Unfortunately, in machine learning we never exactly solve a problem. At best, we approximately solve a problem. This is where the technique needs modification: in software engineering the subproblem solutions are exact, but in machine learning errors compound and the aggregate result can be complete rubbish. In addition, apparently paradoxical situations can arise where a component is “improved” in isolation yet aggregate system performance degrades when this “improvement” is deployed (e.g., because the errors, while less frequent, now follow a pattern that downstream components do not expect).

Does this mean we are doomed to think holistically (which doesn't sound scalable to large problems)? No, but it means you have to be defensive about subproblem decomposition. The best strategy, when feasible, is to train the system end-to-end, i.e., to optimize all components (and the composition strategy) together rather than in isolation. Often this is not feasible, so an alternative (inspired by Bayesian ideas) is to have each component report some kind of confidence or variance along with its output, in order to facilitate downstream processing and integration.
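A minimal sketch of that second alternative, with hypothetical names: each component returns an Estimate carrying its own variance, and a downstream consumer combines them with inverse-variance weighting so that less confident components are discounted.

from dataclasses import dataclass

@dataclass
class Estimate:
    value: float
    variance: float  # the component's own report of how uncertain its output is

def combine(estimates):
    """Inverse-variance (precision-weighted) combination of component outputs."""
    weights = [1.0 / e.variance for e in estimates]
    total = sum(weights)
    value = sum(w * e.value for w, e in zip(weights, estimates)) / total
    return Estimate(value=value, variance=1.0 / total)

# A confident component (variance 0.01) dominates an unsure one (variance 1.0).
print(combine([Estimate(0.7, 0.01), Estimate(0.2, 1.0)]))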

In practice, once systems reach a certain scope, they have to be decomposed so that the work can be divided among many people. The fact that this doesn't work well in machine learning right now is a problem, as elegantly described by Leon Bottou in his ICML 2015 invited talk.

Speaking of another concept that Leon discussed ...

Correctness

In software engineering, an algorithm can be proven correct, in the sense that given particular assumptions about the input, certain properties will be true when the algorithm terminates. In (supervised) machine learning, the only guarantee we really have is that if the training set is an iid sample from a particular distribution, then performance on another iid sample from the same distribution will be close to that on the training set and not too far from optimal.
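As one concrete instance of that kind of guarantee, for a finite hypothesis class $\mathcal{H}$ and a loss bounded in $[0,1]$, Hoeffding's inequality plus a union bound gives, with probability at least $1-\delta$ over a draw of $n$ iid training examples,

\[
\sup_{h \in \mathcal{H}} \left| R(h) - \hat{R}_n(h) \right| \le \sqrt{\frac{\ln |\mathcal{H}| + \ln(2/\delta)}{2n}},
\]

where $R$ is the risk on the underlying distribution and $\hat{R}_n$ the empirical risk; in particular, the empirical risk minimizer is within twice this quantity of the best hypothesis in the class. Note that nothing is promised if the deployment distribution differs from the training distribution.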

Consequently, anyone who practices machine learning for a living has an experimental mindset. Oftentimes I am asked whether option A or option B is better, and most of the time my answer is “I don't know, let's try both and see what happens.” Maybe the most important thing that people in machine learning know is how to assess a model in a way that is predictive of generalization. Even that is a “feel” thing: identifying and preventing leakage between training and validation sets (e.g., by stratified and temporal sampling) is something you learn by screwing up a few times; ditto for counterfactual loops. Kaggle is great for learning about the former, but the latter seems to require making mistakes on a closed-loop system to really appreciate.
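A minimal sketch of guarding against two common forms of leakage (the column names here are hypothetical): split on time so that no future information reaches the training set, and check that the same entity does not end up on both sides of a split.

import pandas as pd

def temporal_split(df: pd.DataFrame, time_col: str, cutoff):
    """Train on everything strictly before the cutoff, validate on the rest."""
    train = df[df[time_col] < cutoff]
    valid = df[df[time_col] >= cutoff]
    return train, valid

def check_no_group_leakage(train: pd.DataFrame, valid: pd.DataFrame, group_col: str):
    """Fail loudly if the same entity (e.g. a user id) shows up in both splits."""
    overlap = set(train[group_col]) & set(valid[group_col])
    if overlap:
        raise ValueError(f"{len(overlap)} groups appear in both train and validation")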

Experimental “correctness” is much weaker than the guarantees from other software, and there are many ways for things to go badly. For example, in my experience it is always temporary: models go stale; it just always seems to happen. Ergo, you need to plan to be continually (hence, automatically) retraining models.

Reuse

This one is interesting. Reuse is the key to leverage in traditional software engineering: it's not just more productive to reuse other code, but every line of code you write yourself is an opportunity to inject defects. Thus, reuse not only lets you move faster but also helps you make fewer mistakes; in return, you must pay the price of learning how to operate a piece of software written by others (when done well, this price has been lowered through good organization, documentation, and community support).

Some aspects of machine learning exhibit exactly the same tradeoff. For instance, if you are writing your own deep learning toolkit, recognize that you are having fun. There's nothing wrong with having fun, and pedagogical activities are arguably better than playing video games all day. However, if you are trying to get something done, you should absolutely attempt to reuse as much technology as you can, which means you should be using a standard toolkit. You will move faster and make fewer mistakes, once you learn how to operate the standard toolkit.

Machine learning toolkits are “traditional software”, however, and are designed to be reused. What about model reuse? That can be good as well, but the caveats about decomposition above still apply. So maybe you use a model which produces features from a user profile as inputs to your model. Fine, but you should version the model you depend upon and not blindly upgrade it without assessment or retraining. Reusing the internals of another model is especially dangerous, as most machine learning models are not identifiable, i.e., they have various internal symmetries which are not determined by the training procedure. Couple an embedding to a tree, for instance, and when the next version of the embedding is a rotation of the previous one, you can watch your performance go to crap immediately.
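A minimal sketch of that versioning discipline, with a hypothetical version name and a placeholder checksum: pin the upstream feature model explicitly and refuse to run against an artifact that has silently changed underneath you.

import hashlib

PINNED_VERSION = "user-profile-features-v3"   # hypothetical upstream model version
PINNED_SHA256 = "0f1e2d..."                   # placeholder: recorded when this model was trained

def load_upstream_model(path: str) -> bytes:
    """Load the pinned upstream artifact, aborting if its checksum has drifted."""
    blob = open(path, "rb").read()
    if hashlib.sha256(blob).hexdigest() != PINNED_SHA256:
        raise RuntimeError(
            f"artifact at {path} does not match the pinned checksum for {PINNED_VERSION}; "
            "re-assess or retrain before upgrading"
        )
    return blob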

Basically, model reuse creates strong coupling between components which can be problematic if one component is changed.

Testing

I find the role of software testing in machine learning to be the trickiest issue of all. Without a doubt testing is necessary, but the challenge in using something like property-based testing is that the concept being captured by the machine learning component is not easily characterized by properties (otherwise, you would write it using non-ML software techniques). To the extent there are some properties that the ML component should exhibit, you can test for these, but unless you incorporate them into the learning procedure itself (e.g., via parameter tying or data augmentation), you are likely to see some violations of the property that are not necessarily indicative of defects.
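A minimal sketch of testing such a property with a tolerance rather than an absolute requirement (the model interface and data are hypothetical): check that a text classifier is mostly invariant to lowercasing, and only fail if the violation rate exceeds a threshold.

def case_invariance_violation_rate(model, texts):
    """Fraction of examples on which lowercasing the input changes the prediction."""
    violations = sum(model.predict(t) != model.predict(t.lower()) for t in texts)
    return violations / len(texts)

def test_case_invariance(model, texts, max_rate=0.02):
    rate = case_invariance_violation_rate(model, texts)
    assert rate <= max_rate, f"case invariance violated on {rate:.1%} of examples"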

Having an “extra-test” data set with minimal acceptable quality is a good idea: this could be easy examples that “any reasonable model” should get correct. There's also self-consistency: at Yahoo they used to ship models with a set of input-output pairs that were computed with the model when it was put together, and if the loaded model didn't reproduce the pairs, the model load was cancelled. (That should never happen, right? Surprise! Maybe you are featurizing the inputs with a different version of some library, or something.)
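A minimal sketch of that self-consistency check (the file layout and model interface are hypothetical): a handful of input-output pairs are recorded when the model is packaged, recomputed at load time, and any divergence aborts the load.

import json
import math

def verify_model(model, pairs_path: str, tol: float = 1e-6):
    """Refuse to load a model that no longer reproduces its packaged reference outputs."""
    with open(pairs_path) as f:
        pairs = json.load(f)  # list of {"input": ..., "expected": <float score>}
    for pair in pairs:
        got = model.predict(pair["input"])
        if not math.isclose(got, pair["expected"], rel_tol=tol, abs_tol=tol):
            raise RuntimeError(
                "model failed its self-consistency check (featurization or library "
                "version mismatch?); refusing to load"
            )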

Monitoring the metrics (proxy and true) of deployed models is also good for detecting problems. If the proxy metric (i.e., the thing on which you actually trained your model and estimated generalization performance) is going south, the inputs to your model are changing somehow (e.g., nonstationary environment, change in feature extraction pipeline); but if the proxy metric is stable while the true metric is going south, the problem might be in how the outputs of your model are being leveraged.
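A minimal sketch of that diagnostic logic, assuming higher-is-better metrics and a hypothetical relative-degradation threshold:

def diagnose(proxy_now, proxy_baseline, true_now, true_baseline, rel_tol=0.05):
    """Route a deployed-model alert based on which metric has degraded."""
    proxy_degraded = proxy_now < proxy_baseline * (1 - rel_tol)
    true_degraded = true_now < true_baseline * (1 - rel_tol)
    if proxy_degraded:
        return "proxy metric dropped: inputs likely changed (nonstationarity or feature pipeline); retrain"
    if true_degraded:
        return "proxy stable but true metric dropped: check how the model's outputs are being used"
    return "ok"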

Unfortunately, what I find is that many software systems with machine learning components are tested in a way that would make traditional software engineers cringe: we look at the output to see if it is reasonable. Crazy! As machine learning becomes a more pervasive part of software engineering, this state of affairs must change.



h2h2 posted on 2017-2-19 17:50:30
Thanks for sharing


paulinokok posted on 2017-2-19 22:26:01
Thank you


paulinokok posted on 2017-2-19 22:27:05
I thought there was an ebook? Thanks


feng026 posted on 2017-2-22 16:28:44
。。。。。。


sacromento (verified student) posted on 2017-2-23 08:46:38
Taking a look. Learned something!

