Original poster: liuxf666

[Study Notes] System Design - Design a Recommendation System

Original post
liuxf666, posted 2019-4-24 09:03:15

Heuristic solution
Although machine learning (ML) is commonly used to build recommendation systems, it is not the only solution. There are many cases where we want a simpler approach: for example, we may have very little data, or we may want to build a minimal solution quickly.
In such cases, we can start with heuristic solutions. In fact, there are lots of hacks we can use to build a simple recommendation system. For instance, based on the videos a user has watched, we can simply suggest videos from the same authors. We can also suggest videos with similar titles or labels. If we add popularity (number of comments, shares) as another signal, the recommendation system can work pretty well as a baseline.
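As a rough illustration of the heuristics above, here is a minimal Python sketch. The `Video` fields and the ordering of the signals (same author first, then label overlap, then popularity) are my own assumptions for the example, not a prescribed design.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    # Hypothetical record; field names are assumptions for this sketch.
    video_id: str
    author: str
    labels: frozenset
    comments: int = 0
    shares: int = 0


def heuristic_recommend(watched, catalog, k=3):
    """Rank unwatched videos by author match, label overlap, then popularity."""
    watched_ids = {v.video_id for v in watched}
    authors = Counter(v.author for v in watched)
    labels = Counter(label for v in watched for label in v.labels)

    def score(video):
        author_hits = authors[video.author]  # 0 if the author is unseen
        label_hits = sum(labels[label] for label in video.labels)
        popularity = video.comments + video.shares
        # Tuples compare left to right, so author beats labels beats popularity.
        return (author_hits, label_hits, popularity)

    candidates = [v for v in catalog if v.video_id not in watched_ids]
    return sorted(candidates, key=score, reverse=True)[:k]
```

With so little logic, a baseline like this is easy to ship first and compare later models against.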
Collaborative filtering
When talking about recommendation systems, I can hardly avoid mentioning collaborative filtering (CF), one of the most popular techniques used in recommendation systems. Since not everyone has a machine learning background, I won't go deeper into the algorithm. In fact, the beauty of collaborative filtering is that the basic idea is so simple that everyone can easily understand it.
In a nutshell, to recommend videos to a user, I can provide videos liked by similar users. For instance, if users A and B have watched many of the same videos, it's highly likely that user A will like videos liked by B. Of course, there are many ways to define what "similar" means here. It could mean that two users have liked the same videos; it could also mean that they share the same location.
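To make the "similar users" idea concrete, here is a minimal sketch. It assumes similarity is the Jaccard overlap of liked-video sets, which is just one of the many possible definitions mentioned above; the function names and the `likes` structure are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_user_based(target, likes, k=3):
    """likes maps user id -> set of liked video ids (hypothetical structure)."""
    seen = likes[target]
    scores = {}
    for other, other_likes in likes.items():
        if other == target:
            continue
        sim = jaccard(seen, other_likes)
        if sim == 0.0:
            continue  # users with nothing in common contribute nothing
        # Each unseen video is credited with the similarity of the user who liked it.
        for video in other_likes - seen:
            scores[video] = scores.get(video, 0.0) + sim
    # Highest score first; ties broken by video id for a stable order.
    return sorted(scores, key=lambda v: (-scores[v], v))[:k]
```

For example, if A likes {v1, v2, v3} and B likes {v1, v2, v3, v4}, their similarity is 3/4, so v4 becomes A's top suggestion.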
The above algorithm is called user-based collaborative filtering. Another version, item-based collaborative filtering, recommends videos (items) that are similar to the videos a user has watched.
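The item-based variant can be sketched in a similar way. This example assumes two videos are "similar" when many of the same users watched both, scored with cosine similarity over co-watch counts; that choice and all names here are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(watch_history):
    """watch_history maps user id -> set of watched video ids."""
    watchers = defaultdict(int)   # video -> number of users who watched it
    together = defaultdict(int)   # (video_a, video_b) -> number of co-watchers
    for videos in watch_history.values():
        for v in videos:
            watchers[v] += 1
        for a, b in combinations(sorted(videos), 2):
            together[(a, b)] += 1
    sims = defaultdict(dict)
    for (a, b), n in together.items():
        sim = n / sqrt(watchers[a] * watchers[b])  # cosine on binary vectors
        sims[a][b] = sim
        sims[b][a] = sim
    return sims


def recommend_item_based(user, watch_history, k=3):
    seen = watch_history[user]
    sims = item_similarities(watch_history)
    scores = defaultdict(float)
    for v in seen:
        for other, sim in sims[v].items():
            if other not in seen:
                scores[other] += sim
    return sorted(scores, key=lambda v: (-scores[v], v))[:k]
```

One practical difference from the user-based version is that the item similarity table depends only on the watch history, so it can be computed once and reused for every user.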
Feature engineering
So for YouTube video recommendation, what features can be used to build the recommendation system?
Usually, there are two types of features: explicit and implicit. Explicit features can be ratings, favorites, etc. On YouTube, they can be the like/share/subscribe actions. Implicit features are less obvious. If a user has watched a video for only a couple of seconds, it's probably a negative sign. Given a list of recommended videos, if a user clicks one over another, it can mean that they prefer the one clicked. Usually, we need to explore implicit features a lot.
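One possible way to turn these signals into a single number per (user, video) pair is sketched below. The event keys, the weights, and the 5% "abandoned early" threshold are all made-up values for illustration; in practice they would need to be tuned by experiment.

```python
def preference_score(event):
    """Combine explicit and implicit signals for one (user, video) pair.

    event is a dict with hypothetical keys: liked, shared, subscribed,
    watch_seconds, video_seconds. Weights are placeholders, not tuned values.
    """
    score = 0.0
    # Explicit signals: the user told us directly what they think.
    for key in ("liked", "shared", "subscribed"):
        if event.get(key):
            score += 1.0
    # Implicit signal: how much of the video was actually watched.
    duration = event.get("video_seconds", 0)
    if duration > 0:
        fraction = min(event.get("watch_seconds", 0) / duration, 1.0)
        # Leaving after a few seconds is treated as a negative sign.
        score += -1.0 if fraction < 0.05 else fraction
    return score
```

A fully watched and liked video scores 2.0 here, while a 10-minute video abandoned after 3 seconds scores -1.0.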
Back to the YouTube problem, several features are quite obvious:
  • Like/share/subscribe: As mentioned above, these are strong signals of a user's preferences.
  • Watch time
  • Video title/labels/categories
  • Freshness
It's worth noting that when building machine learning systems, you have to experiment a lot with different combinations of features; you won't know which ones work well unless you try them.
Infrastructure
This question can also be used to discuss infrastructure. Clearly, the system contains multiple steps/components, so how would you design the whole system in terms of infrastructure?
Given that comparing similar users/videos can be time-consuming at YouTube's scale, this part should be done in offline pipelines. Therefore, we can divide the whole system into an online part and an offline part.
For the offline part, all the user models and videos need to be stored in distributed systems. Pipelines that calculate similar users/videos also run regularly to keep the data up to date. In fact, for most machine learning systems, it's common to use offline pipelines to process big data, since you can't expect such jobs to finish within a few seconds.
For the online part, based on the user's profile and actions (like videos just watched), we should be able to provide a list of recommended videos from the offline data. Normally, the system fetches more videos than needed and then filters and ranks them on the fly. We can filter out videos that are obviously irrelevant, like videos the user has already watched. Then we rank the remaining suggestions. A few factors to consider include video popularity (share/comment/like numbers), freshness, quality and so on.
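The online path described above (over-fetch candidates, filter, rank) could look roughly like this. The `similar_videos` table stands in for the output of the offline pipeline, and the ranking formula (popularity discounted by age) is a placeholder assumption, not a recommended scoring function.

```python
def serve_recommendations(recently_watched, watched_ids, similar_videos,
                          video_stats, k=2, overfetch=3):
    """Online step: fetch candidates from offline data, then filter and rank.

    similar_videos: video id -> list of similar video ids (precomputed offline).
    video_stats: video id -> {"popularity": int, "age_days": int} (hypothetical).
    """
    # 1. Fetch more candidates than needed from the precomputed table.
    candidates = []
    for vid in recently_watched:
        candidates.extend(similar_videos.get(vid, []))
    pool = list(dict.fromkeys(candidates))[: k * overfetch]  # dedupe, keep order

    # 2. Filter out obviously irrelevant videos, e.g. ones already watched.
    pool = [v for v in pool if v not in watched_ids]

    # 3. Rank what is left; fresher and more popular videos score higher.
    def rank_score(v):
        stats = video_stats.get(v, {})
        return stats.get("popularity", 0) / (1 + stats.get("age_days", 0))

    return sorted(pool, key=rank_score, reverse=True)[:k]
```

Because the expensive similarity computation already happened offline, each request only does dictionary lookups and a sort over a small pool.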
Summary
In reality, there are many ways to improve the system that we haven't covered yet. I'd like to briefly mention a few techniques:
  • Freshness can be a very important factor. We should figure out how to recommend fresh content.
  • Evaluation is an essential component of a recommendation system; it allows us to understand how well the system works.
  • To train the collaborative filtering system, we may also include video position signals. Usually, videos ranked at the top have a much higher chance of being clicked.
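As one example of what evaluation can mean in practice, the sketch below computes precision and recall over the top k recommendations against the videos a user actually engaged with. These are common offline metrics, but the post does not say which metrics to use, so treat this as one assumption among many.

```python
def precision_recall_at_k(recommended, relevant, k):
    """recommended: ranked list of video ids; relevant: set of ids the user liked."""
    top_k = recommended[:k]
    hits = sum(1 for v in top_k if v in relevant)
    precision = hits / k if k else 0.0          # share of the top k that was relevant
    recall = hits / len(relevant) if relevant else 0.0  # share of relevant items found
    return precision, recall
```

For instance, with recommendations ["a", "b", "c", "d"], relevant videos {"b", "d", "z"} and k = 2, one of the top two is relevant, giving a precision of 0.5 and a recall of 1/3.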



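2
经管之家编辑部 (verified professional), posted 2019-4-24 09:15:17
Thumbs up!

3
从1万到一亿 (verified professional), posted 2019-4-24 10:07:27

4
分配法 (verified professional), posted 2019-4-24 11:46:35
Keep it up, OP!

5
充实每一天, posted 2019-4-24 11:47:54 from mobile
Liked.

6
珍惜点滴 (verified student), posted 2019-4-24 15:24:14
Thanks for sharing, I'm learning from you. Great post!

7
HappyAndy_Lo, posted 2019-4-24 20:49:52

8
albertwishedu, posted 2019-4-24 20:51:49

9
sacromento (verified student), posted 2019-9-8 08:48:29
Learned a lot, thanks for sharing!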
