- Goal - Design xxx-system
- Assumptions:
- how many users: 100m active users
- Functional requirements:
- Non-functional requirements
- Load calculation
- Bottleneck
- Full Template
- Some domain concepts
- Question List
- ** Design a rate limiter**
- ** 分布式系统中, 怎么样生成全局唯一的 ID**
- ** Finding Frequent Items in Data Streams**
- 设计Calendar
- Design AWS CloudWatch
- Design ride share service (Uber, Lyft)
- [FB] Design a geo info system which provides service to find the nearest n locations from 50M point of interest
- Design a CDN network
- Design a Google document system
- Design a random ID generation system
- Design a key-value database
- Design Amazon’s sales ranking by category feature
- Design social network
- Design the Facebook news feed function
- Design the Facebook timeline function
- Design Twitter - high level
- Design Twitter Search
- Design the Twitter timeline function
- Design a trending topic system
- Design a search engine
- Design a recommendation system
- Design a tinyurl system
- Dropbox
- Design Youtube
- Yelp
- Design a garbage collection system
- Design the Facebook chat function
- Design the WhatsApp
- Design a cache system
- Design a hash map
- Design a least recently used cache
- Design a call center
- Design a deck of cards
- Design a parking lot
- Design a chat server
- ML的系统设计
- ML的系统设计题目
- References - System Design Interview Prep
Goal - Design xxx-system
Assumptions:
- a
- b
- xxx
how many users: 100m active users
Functional requirements:
- support xx million users
- user to do xx action
- real-time (redis in memory) / offline batch
- message queue for transaction sequence
- validation / monitoring
- accounts registration and payments
- rate limit for free accounts
- cache
- multi-language support
- US only or across the world (tax diff, rate diff)
Non-functional requirements
- scalable: 100m DAU
- available: database shard, monitor for auto-scaling (worker pool)
- performance
- security: external/internal user? 2-facto-login, firewall, encryt/decrpyt, user-based-access-rule, Lightweight directory access protocol (LDAP)
- consistency: strong consistency, CAP Theorem, read-heavy/write-heavy
- legal compliance
- cost
Load calculation
Avg Load:
100m * 20 days * 100 msg/day / (30 * 24 * 60 * 60) = 66k quest-per-second (QPS)
Peak load, 3 * avg load: 200k QPS
Bottleneck
- Single points of failure: LB
- Data replication
- Content Delivery Network (CDN)
- spike/peak of traffic
- Scalability: How can the system work for
10 times
more people? - cache: eviction policy, inconsistency
Full Template
Everything is a tradeoff !!!
- Business value and requirement [1 min]
-
FEATURE EXPECTATIONS [5 min]
- Use cases (Business feature / Tech feature)
- Scenarios that will not be covered
- Who will use
- How many will use
- Usage patterns
- Future trend in 5 or 10 years
-
ESTIMATIONS [5 min]
- Throughput (QPS for read and write queries)
- Latency expected from the system (for read and write queries)
- Read/Write ratio
- Traffic estimates
- Write (QPS, Volume of data)
- Read (QPS, Volume of data)
- Storage estimates
- Memory estimates
- If we are using a cache, what is the kind of data we want to store in cache (global cache, vs local cache, vs distributed cache)
- How much RAM and how many machines do we need for us to achieve this ?
- Amount of data you want to store in disk/ssd
- Security
- Measures to ensure blocking attack
- DDos, security patch
- Testing
- Unit test
- Integration test
- Regression test
- Performance test
- E2E testing
- Manual testing
- A/B test
- Maintenance/DevOps
- Infrastructure / host planning
- cost analysis
- Rollout plan (eg. multi-region, onebox=>prod)
- Deploy
- Rollback
- Disaster recovery
- Failover : Active-Passive(master-slave failover), Active-Active(master-master failover)
- Documentation/SOP/UserGuide
- Code review
- Git branch strategy
- Monitoring => constant improving plan
- Metrics
- Alarms
- Dashboard
- Analysis
- visualization
- ML model Internationalization, handling Engligh/French/more
-
DESIGN GOALS [5 min]
- Latency and Throughput requirements
-
Consistency vs Availability [Weak/strong/eventual => consistency Failover/replication => availability]
-
HIGH LEVEL DESIGN [5-10 min]
- APIs for Read/Write scenarios for crucial components
- Database schema
- Basic algorithm
- High level design for Read/Write scenario
-
DEEP DIVE [15-20 min]
- Scaling the algorithm
- Scaling individual components
- Availability, Consistency and Scale story for each component
- Consistency and availability patterns
- Think about the following components, how they would fit in and how it would help
- DNS
- CDN [Push vs Pull]
- Load Balancers [Active-Passive, Active-Active, Layer 4, Layer 7]
- Layer 4 load balancers look at info at the transport layer to decide (At the cost of flexibility, layer 4 load balancing requires less time and computing resources than Layer 7)
- Layer 7 load balancers look at the application layer to decide
- LB methods:
- Least Connection Method
- Least Response Time Method
- Least Bandwidth Method
- Round Robin Method
- Weighted Round Robin Method
- IP Hash
-
Proxy
- Filter requests
- Log requests
- Transform requests (encryption, compression, etc)
- Cache
- Batch requests: Collapsed forwarding, Collapse requests for data
- Reverse Proxy
-
Forward vs reverse and apache’s explain
- Forward Proxy: Acting on behalf of a requestor (or service consumer)
- Reverse Proxy: Acting on behalf of service/content producer.
-
Forward vs reverse and apache’s explain
- Application layer scaling [Microservices, Service Discovery]
- DB [RDBMS, NoSQL]
- Consistent Hashing
- DB indexing
- RDBMS
- Master-slave, Master-master, Federation, Sharding, Denormalization, SQL Tuning
- ACID: Atomicity, consistency, isolation, durability
- NoSQL
- Key-Value, Wide-Column, Graph, Document
- Fast-lookups:
- RAM [Bounded size] => Redis, Memcached
- AP [Unbounded size] => Cassandra, RIAK, Voldemort
- CP [Unbounded size] => HBase, MongoDB, Couchbase, DynamoDB
-
Partitioning
- Methods: Horizontal partitioning, Vertical partitioning, Directory-based partitioning
- Algorithm: Key or hash-based, List partitioning, Round-robin, Composite partitioning(Combine any of above partitioning)
- Problems: Joins and denormalization, Referential integrity, Rebalancing
- Redundancy: special one “Shared-nothing architecture”
-
Caches
- Client caching, CDN caching, Webserver caching, Database caching, Application caching, Cache @Query level, Cache @Object level
- Distributed cache / global cache
- Eviction policies
- Cache aside
- Write through
- Write behind
- Refresh ahead
- Cache order policies
- FIFO: first in first out
- LIFE: last in first out
- LRU: least recently used
- MRU: most recently used
- LFU: least frequently used
- RR: random replacement
- Client caching, CDN caching, Webserver caching, Database caching, Application caching, Cache @Query level, Cache @Object level
- Asynchronism
- Message queues
- Task queues
- Back pressure
- AWS-SQS, RabbitMQ, Redis
- Message queues
-
Client-server Communication
- TCP
- UDP
- REST / socket
- Ajax-polling vs Long-Polling vs WebSockets vs Server-Sent Events
- Websocket: Full duplex communication, allows communication in both directions, and to happen simultaneously
- Ajax-polling vs Long-Polling vs WebSockets vs Server-Sent Events
- RPC
-
JUSTIFY [5 min]
- Throughput of each layer
- Latency caused between each layer
- Overall latency justification
Some domain concepts
- CAP
- CAP is frequently misunderstood as if one has to choose to abandon one of the three guarantees at all times. In fact, the choice is really between consistency and availability only when a network partition or failure happens; at all other times, no trade-off has to be made.
- Weak consistency
- After a write, reads may or may not see it. A best effort approach is taken.
- This approach is seen in systems such as memcached. Weak consistency works well in real time use cases such as VoIP, video chat, and realtime multiplayer games. For example, if you are on a phone call and lose reception for a few seconds, when you regain connection you do not hear what was spoken during connection loss.
- Eventual consistency
- (BASE (Basically Available, Soft state, Eventual consistency) semantics)
- After a write, reads will eventually see it (typically within milliseconds). Data is replicated asynchronously.
- This approach is seen in systems such as DNS and email. Eventual consistency works well in highly available systems.
- Strong consistency
- After a write, reads will see it. Data is replicated synchronously.
- This approach is seen in file systems and RDBMSes. Strong consistency works well in systems that need transactions.
Question List
** Design a rate limiter**
- [single machine] https://medium.com/@saisandeepmopuri/system-design-rate-limiter-and-data-modelling-9304b0d18250
- [distributed] https://konghq.com/blog/how-to-design-a-scalable-rate-limiting-algorithm/
- set up sticky sessions in your load balancer so that each consumer gets sent to exactly one node
- or , use a centralized data store such as Redis or Cassandra
- Or put a “lock” around the key in DB, at row level
- Pros and cons https://aaronice.gitbooks.io/system-design/system-design-problems/designing-an-api-rate-limiter.html
** 分布式系统中, 怎么样生成全局唯一的 ID**
** Finding Frequent Items in Data Streams**
设计Calendar
https://www.xiakexing.me/forum.php?mod=viewthread&tid=4994&highlight=Microsoft%2Bsenior
Feature包括,
-
book a meeting with others.
-
avoid meeting conflict.
-
suggest meeting time if conflict.
Calendar是ood,简单来说就是写三个database schema。
User schema
Id int64 PK
Name string
Date schema
Date Timestamp PK
UserId int64 PK
MeetingId Array
Meeting schema
Id int64 PK
AttendId Array
Start Timestamp
End Timestamp
.........Other fields
写一下sql,来解释如何detect meeting conflict, 以及有conflict后如何推荐meeting时间
Design AWS CloudWatch
**Design Payments system: **https://medium.com/airbnb-engineering/scaling-airbnbs-payment-platform-43ebfc99b324
Design ride share service (Uber, Lyft)
https://medium.com/@narengowda/uber-system-design-8b2bc95e2cfe
- driver rider matching: https://eng.lyft.com/matchmaking-in-lyft-line-691a1a32a008
[FB] Design a geo info system which provides service to find the nearest n locations from 50M point of interest
Design a CDN network
Reference:
Design a Google document system
Reference:
Design a random ID generation system
Reference:
Design a key-value database
Reference:
- Introduction to Redis.
- https://github.com/donnemartin/system-design-primer/blob/master/solutions/system_design/query_cache/README.md
- http://qnimate.com/overview-of-redis-architecture/
Design Amazon’s sales ranking by category feature
Design social network
Design the Facebook news feed function
Reference:
- What are best practices for building something like a News Feed?
- What are the scaling issues to keep in mind while developing a social network feed?
- Activity Feeds Architecture
Design the Facebook timeline function
Reference:
Design Twitter - high level
- Grokking
- Leetcode: 355. Design Twitter
Design Twitter Search
- Grokking
- Leetcode: 355. Design Twitter
Design the Twitter timeline function
Design a trending topic system
Reference:
- Implementing Real-Time Trending Topics With a Distributed Rolling Count Algorithm in Storm
- Early detection of Twitter trends explained
Design a function to return the top k requests during past time interval
Reference:
- Efficient Computation of Frequent and Top-k Elements in Data Streams
- An Optimal Strategy for Monitoring Top-k Queries in Streaming Windows
Design an online multiplayer card game
Reference:
- How to Create an Asynchronous Multiplayer Game
- How to Create an Asynchronous Multiplayer Game Part 2: Saving the Game State to Online Database
- How to Create an Asynchronous Multiplayer Game Part 3: Loading Games from the Database
- How to Create an Asynchronous Multiplayer Game Part 4: Matchmaking
- Real Time Multiplayer in HTML5
Design a graph search function
Reference:
- Building out the infrastructure for Graph Search
- Indexing and ranking in Graph Search
- The natural language interface of Graph Search and Erlang at Facebook.
Design a picture sharing system
Reference:
Design a search engine
Reference:
-
[请反复看多遍] Jeff Dean 2010 talk https://www.youtube.com/watch?v=modXC5IWTJI&list=WL&index=5&t=0s
- Important skill: ability to estimate performance of a system design - without actually having to build it !!!
- Numbers everyone should know: L1 cache 0.5ns, etc
- Version-1: Index server + doc server + cache server + ad-system
- Version-2: GoogleFileSystem
- Repo manager + query-server-tree + cache + spanner(https://cloud.google.com/spanner)
- Canary request https://www.quora.com/High-Availability-What-is-a-canary-request
- Important skill: ability to estimate performance of a system design - without actually having to build it !!!
- How would you implement Google Search?
- Implementing Search Engines
- http://www.ardendertat.com/2012/01/11/implementing-search-engines/
- Elastic-search https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up
Design a recommendation system
Reference:
Design a tinyurl system
Reference:
- System Design for Big Data-tinyurl
- Grokking
- URL Shortener API.
-
Similar: Pastebin (AmazonPaste)
- https://en.wikipedia.org/wiki/Pastebin
- A pastebin or text storage site is a type of online content hosting service where users can store plain text, e.g. to source code snippets for code review via Internet Relay Chat.
- Grokking
Dropbox
Design Youtube
Yelp
Design a garbage collection system
Reference:
Design a scalable web crawling system
Reference:
- Design and Implementation of a High-Performance Distributed Web Crawler
- https://github.com/donnemartin/system-design-primer/blob/master/solutions/system_design/web_crawler/README.md
- Grokking
Design the Facebook chat function
Reference:
Design the WhatsApp
Design a cache system
Reference:
Design a hash map
Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/hash_table/hash_map.ipynb
Design a least recently used cache
Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/lru_cache/lru_cache.ipynb
Design a call center
Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/call_center/call_center.ipynb
Design a deck of cards
Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/deck_of_cards/deck_of_cards.ipynb
Design a parking lot
Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/parking_lot/parking_lot.ipynb
Design a chat server
Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/online_chat/online_chat.ipynb
ML的系统设计
系统的流程(需求->结构->schema->scability->reliability, consistency)
Machine learning interview prep
分解步骤
- I. information and data:5分钟
- II. feature engineering:10分钟
- III. model:15分钟
- IV. evaluation:5分钟
- V. engineering: 10分钟
—Information and data
如果你被问到的问题就是 现在给你一个spotify,请给我设计一个音乐推荐系统, period.
你一定说巧妇难为无米之炊,我首先需要历史数据。
Input / output是什么
基本上最能用的历史数据可以被再细分为三个主要的table
- interaction table or log table, 即每一个点击,每一个购买,每一个评价,都是以一个(user, item) pair的格式记录下来的
(1) 每一条记录,是可以包含some contextual information的,比如时间戳,音乐听了多久,购买花了多少时间,等等,要动脑筋想想什么信息是有用的,尽量多log下来
(2) log可能不止记录了购买信息,还有查看信息,scroll down看到但未被点击信息,都是indicate different level of preference
-
dimention table: user, 即每一个user是有自己的metadata信息的,比如年龄,性别,地域等等demographic信息
-
dimension table: item, 和上面一样,只不过是item的metadata
我一般不会在这个stage对item做embedding即向量化,而是存最原始的raw数据
- High level direction: 1)基于已知question内容,推荐和用户兴趣相关 2)不关心内容,历史点击数据,用每个问题被点击的用户来做collaborative filter 3)一些Topic model的方法,按照topic tag分类
- 第一种approach,需要3个部分1)某一种embedding method去吧每个问题从document变成vector,典型的方法有bow 2)选择某种metrc 去计算特定个vector之间的距离,典型的有euclidean dist, cosine similarity 3)选择某种方法筛选出最相近的K个问题推荐给用户,最简单的是挨个计算distance,然后排序返还最相近的k个
- 更prefer eucliden还是cosine similarity?
—Feature Engineering + Modeling
之前把这两个分开的原因是面试官可能会分别问这两个的问题
但实际上,你的feature engineering一定是要为model服务的
这里给出我经常会使用的几种简单的推荐系统模型
(0). rule based model (难易程度1, make sense程度5)
(1). 转化成classification/regression 模型 (难易程度2, make sense程度5)
(2). matrix factorization (难易程度3, make sense程度2)
(3). factorization machine (难易程度3, make sense程度3)
(4). wide and deep learning (难易程度4, make sense程度4)
我个人最喜欢的,而且觉得比较稳妥的是(0)+(1),说的好是绝对没问题的
至于(2)(3)(4)网上有大把的资料,但你不能看一篇paper或者video就上了,一定要谨慎想好可能遇到的问题,比如
-
matrix factorization可能会问你怎么加进user and item metadata, 你的retrain plan是什么
-
wide and deep learning可能会问你deep part的神经网络怎么搭的,为什么
一句话追求fancy就要担着玩儿大了的风险
对于(0)来说,作为一个basis model有提到的价值,在处理cold start的时候是很有用的,但不可恋战一定尽快带过
rule无外乎根据两个heuristic
-
对user_i 找到和ta类似的user,看看别人买了什么,推荐给ta
-
对user_i买的item_j, 找到类似的东西,推荐给ta
这里面涉及的知识点就是怎么定义“类似”,即需要一个distance metric,这点大家去网上找吧,很多的similarity metrics
重点是(1),这里依然拿spotify music 推荐当例子那么如果转化成一个binary classification问题,我们就要拿出所有的music listening history log, 即tuple of (user, item, context),
比如(berserker888, Titanium, 2019-01-01:12:00:00)当做我们的positive case,即y值为1
那我听了这一首歌,就没有听剩下的所有歌,那我在这个context下所有没听的歌理论上都是negative case, 这里要知道unbalanced data的危害以及一种random sample negative case的逻辑。
同样可以把问题转化成regression问题,就像之前说的,我看了没有听,我听了,和我压根没看的东西代表着3种preference level.
下面简短启发一下feature的展开,user, item, context都是需要vectorize化的
最简单的就是直接join user, item dimension table拿到metadata。
但有的时候比如item是视频图像或者自然语言,就需要我们通过一些方法,比如pretrained deep learning model take bottleneck layer,请大家自行搜索word embedding和image embedding
稍微tricky一点的是contextual feature,这里可能需要aggregate log来达到目的,我建议大家提前想好一些feature,给一个例子:has user_i listened to item_j’s category in the past week at night?
- Data cleaning
- Data spill
- Pre-processing
- Post-processing
- Missing data:
- -If data size is large, can ignore.
- -Make missing data a new categorical feature.
- -If linear, take mean to fill in…
- Pvalue是什么
- 怎样判断和处理overfitting / underfitting
- @Note: more in https://blog.csdn.net/tz_zs/article/details/78588478
- 数据 特征 模型
- 过拟合的原因就类似
- 模型过于复杂(类似神经网络)
- no regularization
- 正则化技术还有:扩增样本集、早停止、Dropout、集成学习、多任务学习、对抗训练、参数共享等。
- epoch过大
- 数据量太小
- unbalanced data的危害 - 7 Techniques to Handle Imbalanced Data
-
- Use the right evaluation metrics
-
- Resample the training set
-
- Use K-fold Cross-Validation in the right way
-
- Ensemble different resampled datasets
-
- Resample with different ratios
-
- Cluster the abundant class
-
- Design your own models
-
- logistic regression的objective function, softmax的细节
- MLE / MAP 推导
- Maximum Likelihood Estimation, MLE是频率学派常用的估计方法
- Maximum A Posteriori, MAP是贝叶斯学派常用的估计方法!
- Feature engineering
- feature可能是sparse的,怎么处理
- Compute复杂度优化
- 可以分布式计算,例如10台计算机,每个分别找出k个最相近的,然后master node使用min heap 分路归并可以快速从10*K个最小的结果中找出最小的k个
- 升query速度, 基本上对每个没有incoming edge的node,算出所有这个node到可到node的汇率,存一下,computational complexity在O(N), query complexity在O(1)
- 怎么做model diagnosis
- How to debug model
- 优化模型
—Evaluation
一定要区分statistical metric和business metric
- 前者是你训练模型时候定义的metric,主要用来tune hyperparameter的,比如accuracy, F1-score…
- 后者定义你的模型是否有意义的metric,比如推荐歌单平均听的时长
一般后者意义更大,更会被问。因为是无法直接直接optimize的,只能通过ab testing才测试,所以也有可能会被问到一点点experiment design的知识
Model metrics
- AUC
- Log loss: cross-entropy
- F-score:
– ML engineering (ML Engineer 重点)
- Testing various aspects of the pipeline (data pre-processing and augmentation, input and output sanitization, model inference time).
- Data validation testing
- Model versioning, rollback, disaster recovery
- Alarm setup
- Setting up a distributed infrastructure to run training, hyperparameter search, or inference more efficiently.
- Note@Forecasting
- Forecast 的流程图
- Data ingestion
- Data validation
- Data pre-process: add, delete, regularization, normalization
- Mode training in Sagemaker
- Training notification via email/msg to owners
- Model approval before in prod
- Model deploy to prod, old model artifacts archive to s3
- Cloudwatch metrics to monitor traffic and model performance
- Alarms to auto-trigger rollback if errors/exceptions found
- Auto generate weekly review report
- Feedback section so users can submit suggestions, feature request, or bugs
- Time-series 算法
- Autoregressive Integrated Moving Average (ARIMA) Algorithm
- DeepAR+ Algorithm*
- arn:aws:forecast:::algorithm/Deep_AR_Plus
- Exponential Smoothing (ETS) Algorithm
- arn:aws:forecast:::algorithm/ETS
- Non-Parametric Time Series (NPTS) Algorithm
- arn:aws:forecast:::algorithm/NPTS
- Prophet Algorithm
- arn:aws:forecast:::algorithm/Prophet
- Forecast 的流程图
ML的系统设计题目
- [Yelp] 饭馆的推荐,涉及到了geolocation information
- [Instgram] Story推荐,每条Story是独一无二的并且是有时间性的
- [Spotify] 音乐推荐,怎么把音乐做个embedding
- [FB] news feed或者ads recommendation
- Facebook Newsfeed推荐,涉及到了不同user之前的networking
- [linkedin] On linkedin, create a metric so that you can rank activities (changing a job, sharing an article, having new connection and so on) and display the top k, next top k activities to to a user.
- [Google] 怎么设计一个ML system来解crossword puzzle
- [Quora] 怎么设计 一个给User推荐Question的系统
References - System Design Interview Prep
- Nail the System Design Interview - https://blog.tryexponent.com/how-to-nail-the-system-design-interview/
- How to prepare for the System Design Interview - https://www.educative.io/blog/how-to-prepare-system-design-interview
- 31 System Design Interview Questions - https://igotanoffer.com/blogs/tech/system-design-interviews
- LeetCode.ca design https://leetcode.ca/tags/#.Design
- design summary https://darktiantian.github.io/LeetCode%E7%AE%97%E6%B3%95%E9%A2%98%E6%95%B4%E7%90%86%EF%BC%88%E8%AE%BE%E8%AE%A1%E7%AF%87%EF%BC%89Design/
- https://github.com/donnemartin/system-design-primer
- https://www.youtube.com/watch?v=43bB7oSn190&ab_channel=Jordanhasnolife
- LeetCode Template: https://leetcode.com/discuss/career/229177/My-System-Design-Template
- https://aaronice.gitbooks.io/system-design/system-design-problems/designing-an-api-rate-limiter.html
- https://tianpan.co/notes/2016-02-13-crack-the-system-design-interview
- book Grokking-System-Design-Interview
- CRLS https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/
.