[Study Notes] System Design 1 - Approaches and Trade-offs


Original poster

liuxf666 posted on 2019-3-31 22:05:34


1. How to approach a system design

Step 1: Outline use cases, constraints, and assumptions
Gather requirements and scope the problem.

Who is going to use it?
How are they going to use it?
How many users are there?
What does the system do?
What are the inputs and outputs of the system?
How much data do we expect to handle?
How many requests per second do we expect?
What is the expected read to write ratio?
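The answers to these questions can be turned into rough numbers early on. The following is a minimal back-of-the-envelope sketch, not part of the original notes. The function name `estimate_load` and every input figure (1 million daily users, 10 requests per user, a 9:1 read to write ratio, 1,000 bytes per write) are assumptions chosen only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_load(daily_users, requests_per_user_per_day,
                  reads_per_write, bytes_per_write):
    """Return rough average request rates and daily write volume.

    All inputs are illustrative assumptions, not measurements.
    """
    total_rps = daily_users * requests_per_user_per_day / SECONDS_PER_DAY
    # With a read:write ratio of r:1, one in every (r + 1) requests is a write.
    write_rps = total_rps / (reads_per_write + 1)
    read_rps = total_rps - write_rps
    bytes_written_per_day = write_rps * SECONDS_PER_DAY * bytes_per_write
    return {
        "total_rps": total_rps,
        "read_rps": read_rps,
        "write_rps": write_rps,
        "bytes_written_per_day": bytes_written_per_day,
    }


# Hypothetical workload used only to exercise the function.
estimate = estimate_load(
    daily_users=1_000_000,
    requests_per_user_per_day=10,
    reads_per_write=9,
    bytes_per_write=1_000,
)
for name, value in estimate.items():
    print(f"{name}: {value:,.1f}")
```

These are daily averages. Peak traffic is usually higher than the average, so a design would normally leave headroom above these figures.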

Step 2: Create a high level design
Outline a high level design with all important components.

Sketch the main components and connections
Justify the ideas

Step 3: Design core components
Dive into details for each core component.

Step 4: Scale the design
Identify and address bottlenecks, given the constraints. For example, do you need the following to address scalability issues?

Load balancer
Horizontal scaling
Caching
Database sharding

Discuss potential solutions and trade-offs. Everything is a trade-off. Address bottlenecks using principles of scalable system design.
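As one concrete instance of the options listed above, database sharding needs a rule that maps each record to a shard. The sketch below shows simple hash-based routing. It is an illustration under assumptions, not a scheme from the original notes: the function name `shard_for`, the key format `user:<id>`, and the shard count of 4 are all hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in the range [0, num_shards).

    A stable hash is used so the same key always lands on the same shard,
    regardless of which process computes it.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical keys: see how 1,000 user IDs spread across 4 shards.
distribution = Counter(shard_for(f"user:{i}", 4) for i in range(1000))
print(sorted(distribution.items()))
```

The trade-off with plain modulo routing is that changing `num_shards` remaps most keys, which forces a large data migration. That cost is one reason to discuss sharding only after the bottleneck is confirmed.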

2. System Design trade-offs

2.1 Performance vs scalability
A service is scalable if adding resources increases its performance in proportion to the resources added. Generally, increasing performance means serving more units of work, but it can also mean handling larger units of work, such as when datasets grow.
Another way to look at performance vs scalability:

If you have a performance problem, your system is slow for a single user.
If you have a scalability problem, your system is fast for a single user but slow under heavy load.
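The "proportional to resources added" definition can be checked with a simple ratio. The sketch below is an assumption-laden illustration: the function name `scaling_efficiency` and the throughput measurements are hypothetical, invented only to show the arithmetic. A value near 1.0 means performance grew in proportion to resources; a value well below 1.0 suggests a scalability problem.

```python
def scaling_efficiency(base_nodes, base_throughput, nodes, throughput):
    """Ratio of observed speedup to resource growth.

    1.0 means throughput grew exactly in proportion to the nodes added.
    """
    speedup = throughput / base_throughput
    resource_growth = nodes / base_nodes
    return speedup / resource_growth


# Hypothetical measurements: (number of nodes, requests per second).
measurements = [(1, 1000), (2, 1900), (4, 3600), (8, 5200)]
base_nodes, base_throughput = measurements[0]

efficiencies = {
    nodes: scaling_efficiency(base_nodes, base_throughput, nodes, throughput)
    for nodes, throughput in measurements
}
for nodes, efficiency in efficiencies.items():
    print(f"{nodes} node(s): efficiency {efficiency:.2f}")
```

In this made-up data set, efficiency drops sharply at 8 nodes, which is the point where one would start looking for a shared bottleneck.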


2.2 Latency vs throughput
Latency is the time to perform some action or to produce some result.
Throughput is the number of such actions or results per unit of time.
Generally, you should aim for maximal throughput with acceptable latency.
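Batching is one common place where this trade-off shows up: larger batches raise throughput but every item waits longer. The sketch below is a toy cost model, not something from the original notes. The names `batch_stats` and `best_batch_size`, the 10 ms fixed overhead, the 1 ms per-item cost, and the 50 ms latency budget are all assumptions chosen for illustration.

```python
def batch_stats(batch_size, overhead_ms, per_item_ms):
    """Return (latency in ms, throughput in items per second) for one batch.

    Toy model: a batch costs a fixed overhead plus a per-item cost, and
    every item in the batch sees the full batch latency.
    """
    latency_ms = overhead_ms + batch_size * per_item_ms
    throughput = batch_size * 1000 / latency_ms
    return latency_ms, throughput


def best_batch_size(max_latency_ms, overhead_ms, per_item_ms, max_batch=1024):
    """Pick the batch size with the highest throughput within the latency budget."""
    best = None
    for size in range(1, max_batch + 1):
        latency_ms, throughput = batch_stats(size, overhead_ms, per_item_ms)
        if latency_ms > max_latency_ms:
            continue
        if best is None or throughput > best[2]:
            best = (size, latency_ms, throughput)
    return best  # None if even a batch of one exceeds the budget


# Hypothetical costs: 10 ms overhead, 1 ms per item, 50 ms latency budget.
unbatched = batch_stats(1, overhead_ms=10, per_item_ms=1)
chosen = best_batch_size(max_latency_ms=50, overhead_ms=10, per_item_ms=1)
print("unbatched (latency ms, items/s):", unbatched)
print("chosen (size, latency ms, items/s):", chosen)
```

Under these assumed costs, the search settles on the largest batch that still fits the latency budget, which is "maximal throughput with acceptable latency" in miniature.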
Source(s) and further reading

Understanding latency vs throughput

2.3 Availability vs consistency
In a distributed computer system, you can only support two of the following guarantees:

Consistency - Every read receives the most recent write or an error
Availability - Every request receives a response, without guarantee that it contains the most recent version of the information
Partition Tolerance - The system continues to operate despite arbitrary partitioning due to network failures

Networks aren't reliable, so you'll need to support partition tolerance.  You'll need to make a software tradeoff between consistency and availability.
CP - consistency and partition tolerance
Waiting for a response from the partitioned node might result in a timeout error. CP is a good choice if your business needs require atomic reads and writes.

AP - availability and partition tolerance
Responses return the most recent version of the data available on a node, which might not be the latest. Writes might take some time to propagate when the partition is resolved.
AP is a good choice if the business needs allow for eventual consistency or when the system needs to continue working despite external errors.
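The difference between CP and AP can be shown with a toy replica that has been cut off from the rest of the system. This is a simplified sketch under assumptions, not a real replication protocol: the class `Replica`, its `partitioned` flag, and the values `"v1"` and `"v2"` are hypothetical names used only for illustration.

```python
class Replica:
    """Toy replica that behaves as CP or AP while partitioned."""

    def __init__(self, mode: str):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.value = "v1"         # last value this replica received
        self.partitioned = False  # True when cut off from the writer

    def replicate(self, value):
        """Apply an update from the writer; it is lost during a partition."""
        if not self.partitioned:
            self.value = value

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: refuse to answer rather than risk returning stale data.
            raise TimeoutError("cannot confirm the latest write during partition")
        # AP: answer with whatever this node has, which may be stale.
        return self.value


cp_replica, ap_replica = Replica("CP"), Replica("AP")
for replica in (cp_replica, ap_replica):
    replica.partitioned = True
    replica.replicate("v2")  # the new write never arrives

print("AP read:", ap_replica.read())
try:
    cp_replica.read()
except TimeoutError as error:
    print("CP read failed:", error)
```

The AP replica stays available but serves the older value, while the CP replica returns an error, matching the two descriptions above.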

2.3.1 Consistency patterns
With multiple copies of the same data, we are faced with options on how to synchronize them so clients have a consistent view of the data. Recall the definition of consistency from the CAP theorem - Every read receives the most recent write or an error.

#1 Weak consistency
After a write, reads may or may not see it. A best effort approach is taken.
This approach is seen in systems such as memcached. Weak consistency works well in real time use cases such as VoIP, video chat, and realtime multiplayer games. For example, if you are on a phone call and lose reception for a few seconds, when you regain connection you do not hear what was spoken during connection loss.

#2 Eventual consistency
After a write, reads will eventually see it (typically within milliseconds). Data is replicated asynchronously.
This approach is seen in systems such as DNS and email. Eventual consistency works well in highly available systems.

#3 Strong consistency
After a write, reads will see it. Data is replicated synchronously.
This approach is seen in file systems and RDBMSes. Strong consistency works well in systems that need transactions.
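The contrast between synchronous and asynchronous replication can be sketched with a toy store that has one primary and one replica. This is an illustration under assumptions, not how any particular database works: the class `ReplicatedStore` and its methods are hypothetical, and "asynchronous" is modeled as a queue that is drained by an explicit call.

```python
from collections import deque


class ReplicatedStore:
    """Toy primary plus one replica, replicated synchronously or asynchronously."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = {}
        self.replica = {}
        self._pending = deque()  # writes not yet applied to the replica

    def write(self, key, value):
        self.primary[key] = value
        if self.synchronous:
            # Strong consistency: the replica is updated before write() returns.
            self.replica[key] = value
        else:
            # Eventual consistency: the replica catches up later.
            self._pending.append((key, value))

    def replicate_pending(self):
        """Apply queued writes to the replica, in order."""
        while self._pending:
            key, value = self._pending.popleft()
            self.replica[key] = value

    def read_from_replica(self, key):
        return self.replica.get(key)


strong = ReplicatedStore(synchronous=True)
strong.write("balance", 100)
print("strong, right after write:", strong.read_from_replica("balance"))

eventual = ReplicatedStore(synchronous=False)
eventual.write("balance", 100)
print("eventual, right after write:", eventual.read_from_replica("balance"))
eventual.replicate_pending()
print("eventual, after replication:", eventual.read_from_replica("balance"))
```

The synchronous store pays for its guarantee by doing more work inside each write, while the asynchronous store returns sooner but exposes a window in which replica reads miss the latest write.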
Source(s) and further reading

Transactions across data centers

2.3.2 Availability patterns
There are two main patterns to support high availability: fail-over and replication.

#1 Fail-over

Active-passive
With active-passive fail-over, heartbeats are sent between the active and the passive server on standby. If the heartbeat is interrupted, the passive server takes over the active's IP address and resumes service.
The length of downtime is determined by whether the passive server is already running in 'hot' standby or whether it needs to start up from 'cold' standby.  Only the active server handles traffic.
Active-passive failover can also be referred to as master-slave failover.
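The heartbeat check on the passive side can be sketched as follows. This is a minimal illustration under assumptions: the class name `StandbyMonitor` and the 3 second timeout are hypothetical, timestamps are passed in explicitly so the example stays deterministic, and the actual IP takeover is reduced to setting a flag.

```python
class StandbyMonitor:
    """Passive server's view of the active server's heartbeats."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_heartbeat = None  # time of the most recent heartbeat
        self.is_active = False      # True once this server has taken over

    def on_heartbeat(self, now: float):
        """Record a heartbeat received from the active server."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Take over if the heartbeat has been silent longer than the timeout."""
        silent_too_long = (
            self.last_heartbeat is not None
            and now - self.last_heartbeat > self.timeout_s
        )
        if silent_too_long and not self.is_active:
            # A real system would claim the active server's IP address here.
            self.is_active = True
        return self.is_active


standby = StandbyMonitor(timeout_s=3.0)
standby.on_heartbeat(now=0.0)
standby.on_heartbeat(now=1.0)
print("at t=2.0, took over:", standby.check(now=2.0))
print("at t=5.0, took over:", standby.check(now=5.0))
```

The timeout is itself a trade-off: a short value shortens downtime but risks a false takeover when heartbeats are merely delayed.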
Active-active
In active-active, both servers are managing traffic, spreading the load between them.
If the servers are public-facing, the DNS would need to know about the public IPs of both servers.  If the servers are internal-facing, application logic would need to know about both servers.
Active-active failover can also be referred to as master-master failover.
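For the internal-facing case, "application logic would need to know about both servers" might look like the round-robin picker below. This is a sketch under assumptions, not a production load balancer: the class `RoundRobin`, its methods, and the two IP addresses are hypothetical, and health status is set by hand rather than detected.

```python
import itertools


class RoundRobin:
    """Rotate requests across servers, skipping any marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        # One full pass over the rotation is enough to find a healthy server.
        for _ in range(len(self.servers)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


# Hypothetical internal addresses for the two active servers.
balancer = RoundRobin(["10.0.0.1", "10.0.0.2"])
print([balancer.next_server() for _ in range(4)])

balancer.mark_down("10.0.0.1")
print([balancer.next_server() for _ in range(2)])
```

When one server is marked down, the survivor takes all of the traffic, so each server needs enough spare capacity to carry the full load on its own.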
Disadvantage(s): failover

Fail-over adds more hardware and additional complexity.
There is a potential for loss of data if the active system fails before any newly written data can be replicated to the passive.

#2 Replication
Master-slave and master-master
To be discussed in detail later.


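Floor 2

经管之家编辑部

posted on 2019-3-31 22:11:45

Thumbs up for you!

Floor 3

充实每一天 posted on 2019-3-31 22:22:15, from mobile

Floor 4

hifinecon posted on 2019-3-31 23:54:47

Floor 5

cdc0215 posted on 2019-4-1 00:39:42

Thumbs up for you!

Floor 6

artra2012

posted on 2019-4-1 18:38:36

Thumbs up for you!!!

Floor 7

珍惜点滴

posted on 2019-4-1 18:52:15

Learning from you. Thumbs up!

Floor 8

sulight

posted on 2019-4-1 20:07:23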

Thanks for sharing.
Study takeaways:
Active-active
In active-active, both servers are managing traffic, spreading the load between them.
If the servers are public-facing, the DNS would need to know about the public IPs of both servers. If the servers are internal-facing, application logic would need to know about both servers.
Active-active failover can also be referred to as master-master failover.
Disadvantage(s): failover
Fail-over adds more hardware and additional complexity.
There is a potential for loss of data if the active system fails before any newly written data can be replicated to the passive.
#2 Replication
Master-slave and master-master
To be discussed in detail later.
