Goal - Design xxx-system

Assumptions:

  1. a
  2. b
  3. xxx

how many users: 100m active users

Functional requirements:

  1. support xx million users
  2. user to do xx action
  3. real-time (redis in memory) / offline batch
  4. message queue for transaction sequence
  5. validation / monitoring
  6. accounts registration and payments
  7. rate limit for free accounts
  8. cache
  9. multi-language support
  10. US only or across the world (tax diff, rate diff)

Non-functional requirements

  1. scalable: 100m DAU
  2. available: database shard, monitor for auto-scaling (worker pool)
  3. performance
  4. security: external/internal user? 2-facto-login, firewall, encryt/decrpyt, user-based-access-rule, Lightweight directory access protocol (LDAP)
  5. consistency: strong consistency, CAP Theorem, read-heavy/write-heavy
  6. legal compliance
  7. cost

Load calculation

Avg Load: 100m * 20 days * 100 msg/day / (30 * 24 * 60 * 60) = 66k quest-per-second (QPS)

Peak load, 3 * avg load: 200k QPS

Bottleneck

  1. Single points of failure: LB
  2. Data replication
  3. Content Delivery Network (CDN)
  4. spike/peak of traffic
  5. Scalability: How can the system work for 10 times more people?
  6. cache: eviction policy, inconsistency

Full Template

Everything is a tradeoff !!!

  1. Business value and requirement [1 min]
  2. FEATURE EXPECTATIONS [5 min]
    1. Use cases (Business feature / Tech feature)
    2. Scenarios that will not be covered
    3. Who will use
    4. How many will use
    5. Usage patterns
    6. Future trend in 5 or 10 years
  3. ESTIMATIONS [5 min]
    1. Throughput (QPS for read and write queries)
    2. Latency expected from the system (for read and write queries)
    3. Read/Write ratio
    4. Traffic estimates
      1. Write (QPS, Volume of data)
      2. Read (QPS, Volume of data)
    5. Storage estimates
    6. Memory estimates
      1. If we are using a cache, what is the kind of data we want to store in cache (global cache, vs local cache, vs distributed cache)
      2. How much RAM and how many machines do we need for us to achieve this ?
      3. Amount of data you want to store in disk/ssd
    7. Security
      1. Measures to ensure blocking attack
      2. DDos, security patch
    8. Testing
      1. Unit test
      2. Integration test
      3. Regression test
      4. Performance test
      5. E2E testing
      6. Manual testing
      7. A/B test
    9. Maintenance/DevOps
      1. Infrastructure / host planning
      2. cost analysis
      3. Rollout plan (eg. multi-region, onebox=>prod)
      4. Deploy
      5. Rollback
      6. Disaster recovery
        1. Failover : Active-Passive(master-slave failover), Active-Active(master-master failover)
      7. Documentation/SOP/UserGuide
      8. Code review
      9. Git branch strategy
    10. Monitoring => constant improving plan
      1. Metrics
      2. Alarms
      3. Dashboard
      4. Analysis
      5. visualization
      6. ML model Internationalization, handling Engligh/French/more
  4. DESIGN GOALS [5 min]
    1. Latency and Throughput requirements
    2. Consistency vs Availability [Weak/strong/eventual => consistency Failover/replication => availability]
  5. HIGH LEVEL DESIGN [5-10 min]
    1. APIs for Read/Write scenarios for crucial components
    2. Database schema
    3. Basic algorithm
    4. High level design for Read/Write scenario
  6. DEEP DIVE [15-20 min]
    1. Scaling the algorithm
    2. Scaling individual components
      1. Availability, Consistency and Scale story for each component
      2. Consistency and availability patterns
    3. Think about the following components, how they would fit in and how it would help
      1. DNS
      2. CDN [Push vs Pull]
      3. Load Balancers [Active-Passive, Active-Active, Layer 4, Layer 7]
        1. Layer 4 load balancers look at info at the transport layer to decide (At the cost of flexibility, layer 4 load balancing requires less time and computing resources than Layer 7)
        2. Layer 7 load balancers look at the application layer to decide
        3. LB methods:
          1. Least Connection Method
          2. Least Response Time Method
          3. Least Bandwidth Method
          4. Round Robin Method
          5. Weighted Round Robin Method
          6. IP Hash
      4. Proxy
        1. Filter requests
        2. Log requests
        3. Transform requests (encryption, compression, etc)
        4. Cache
        5. Batch requests: Collapsed forwarding, Collapse requests for data
      5. Reverse Proxy
        1. Forward vs reverse and apache’s explain
          1. Forward Proxy: Acting on behalf of a requestor (or service consumer)
          2. Reverse Proxy: Acting on behalf of service/content producer.
      6. Application layer scaling [Microservices, Service Discovery]
      7. DB [RDBMS, NoSQL]
        1. Consistent Hashing
        2. DB indexing
        3. RDBMS
          1. Master-slave, Master-master, Federation, Sharding, Denormalization, SQL Tuning
          2. ACID: Atomicity, consistency, isolation, durability
        4. NoSQL
          1. Key-Value, Wide-Column, Graph, Document
          2. Fast-lookups:
            1. RAM [Bounded size] => Redis, Memcached
            2. AP [Unbounded size] => Cassandra, RIAK, Voldemort
            3. CP [Unbounded size] => HBase, MongoDB, Couchbase, DynamoDB
      8. Partitioning
        1. Methods: Horizontal partitioning, Vertical partitioning, Directory-based partitioning
        2. Algorithm: Key or hash-based, List partitioning, Round-robin, Composite partitioning(Combine any of above partitioning)
        3. Problems: Joins and denormalization, Referential integrity, Rebalancing
      9. Redundancy: special one “Shared-nothing architecture
      10. Caches
        1. Client caching, CDN caching, Webserver caching, Database caching, Application caching, Cache @Query level, Cache @Object level
          1. Distributed cache / global cache
        2. Eviction policies
          1. Cache aside
          2. Write through
          3. Write behind
          4. Refresh ahead
        3. Cache order policies
          1. FIFO: first in first out
          2. LIFE: last in first out
          3. LRU: least recently used
          4. MRU: most recently used
          5. LFU: least frequently used
          6. RR: random replacement
      11. Asynchronism
        1. Message queues
          1. Task queues
          2. Back pressure
          3. AWS-SQS, RabbitMQ, Redis
      12. Client-server Communication
        1. TCP
        2. UDP
        3. REST / socket
          1. Ajax-polling vs Long-Polling vs WebSockets vs Server-Sent Events
            1. Websocket: Full duplex communication, allows communication in both directions, and to happen simultaneously
        4. RPC
  7. JUSTIFY [5 min]
    1. Throughput of each layer
    2. Latency caused between each layer
    3. Overall latency justification

Some domain concepts

  1. CAP
    1. CAP is frequently misunderstood as if one has to choose to abandon one of the three guarantees at all times. In fact, the choice is really between consistency and availability only when a network partition or failure happens; at all other times, no trade-off has to be made.
    2. Weak consistency
      1. After a write, reads may or may not see it. A best effort approach is taken.
      2. This approach is seen in systems such as memcached. Weak consistency works well in real time use cases such as VoIP, video chat, and realtime multiplayer games. For example, if you are on a phone call and lose reception for a few seconds, when you regain connection you do not hear what was spoken during connection loss.
    3. Eventual consistency
      1. (BASE (Basically Available, Soft state, Eventual consistency) semantics)
      2. After a write, reads will eventually see it (typically within milliseconds). Data is replicated asynchronously.
      3. This approach is seen in systems such as DNS and email. Eventual consistency works well in highly available systems.
    4. Strong consistency
      1. After a write, reads will see it. Data is replicated synchronously.
      2. This approach is seen in file systems and RDBMSes. Strong consistency works well in systems that need transactions.

Question List

** Design a rate limiter**

** 分布式系统中, 怎么样生成全局唯一的 ID**

** Finding Frequent Items in Data Streams**

设计Calendar

https://www.xiakexing.me/forum.php?mod=viewthread&tid=4994&highlight=Microsoft%2Bsenior

Feature包括,

  1. book a meeting with others.

  2. avoid meeting conflict.

  3. suggest meeting time if conflict.

Calendar是ood,简单来说就是写三个database schema。

User schema

Id    int64 PK

Name    string

Date schema

Date    Timestamp    PK

UserId    int64    PK

MeetingId    Array

Meeting schema

Id    int64 PK

AttendId    Array

Start    Timestamp

End    Timestamp

.........Other fields

写一下sql,来解释如何detect meeting conflict, 以及有conflict后如何推荐meeting时间

Design AWS CloudWatch

**Design Payments system: **https://medium.com/airbnb-engineering/scaling-airbnbs-payment-platform-43ebfc99b324

Design ride share service (Uber, Lyft)

https://medium.com/@narengowda/uber-system-design-8b2bc95e2cfe

[FB] Design a geo info system which provides service to find the nearest n locations from 50M point of interest

Design a CDN network

Reference:

Design a Google document system

Reference:

Design a random ID generation system

Reference:

Design a key-value database

Reference:

Design Amazon’s sales ranking by category feature

Design social network

Design the Facebook news feed function

Reference:

Design the Facebook timeline function

Reference:

Design Twitter - high level

  • Grokking
  • Leetcode: 355. Design Twitter
  • Grokking
  • Leetcode: 355. Design Twitter

Design the Twitter timeline function

Reference:

Design a function to return the top k requests during past time interval

Reference:

Design an online multiplayer card game

Reference:

Design a graph search function

Reference:

Design a picture sharing system

Reference:

Design a search engine

Reference:

Design a recommendation system

Reference:

Design a tinyurl system

Reference:

Dropbox

Design Youtube

Yelp

Design a garbage collection system

Reference:

Design a scalable web crawling system

Reference:

Design the Facebook chat function

Reference:

Design the WhatsApp

Design a cache system

Reference:

Design a hash map

Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/hash_table/hash_map.ipynb

Design a least recently used cache

Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/lru_cache/lru_cache.ipynb

Design a call center

Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/call_center/call_center.ipynb

Design a deck of cards

Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/deck_of_cards/deck_of_cards.ipynb

Design a parking lot

Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/parking_lot/parking_lot.ipynb

Design a chat server

Solution https://github.com/donnemartin/system-design-primer/blob/master/solutions/object_oriented_design/online_chat/online_chat.ipynb

ML的系统设计

系统的流程(需求->结构->schema->scability->reliability, consistency)

Machine learning interview prep

分解步骤

  • I. information and data:5分钟
  • II. feature engineering:10分钟
  • III. model:15分钟
  • IV. evaluation:5分钟
  • V. engineering: 10分钟

—Information and data

如果你被问到的问题就是 现在给你一个spotify,请给我设计一个音乐推荐系统, period.

你一定说巧妇难为无米之炊,我首先需要历史数据。

Input / output是什么

基本上最能用的历史数据可以被再细分为三个主要的table

  1. interaction table or log table, 即每一个点击,每一个购买,每一个评价,都是以一个(user, item) pair的格式记录下来的

(1) 每一条记录,是可以包含some contextual information的,比如时间戳,音乐听了多久,购买花了多少时间,等等,要动脑筋想想什么信息是有用的,尽量多log下来

(2) log可能不止记录了购买信息,还有查看信息,scroll down看到但未被点击信息,都是indicate different level of preference

  1. dimention table: user, 即每一个user是有自己的metadata信息的,比如年龄,性别,地域等等demographic信息

  2. dimension table: item, 和上面一样,只不过是item的metadata

我一般不会在这个stage对item做embedding即向量化,而是存最原始的raw数据

  • High level direction: 1)基于已知question内容,推荐和用户兴趣相关 2)不关心内容,历史点击数据,用每个问题被点击的用户来做collaborative filter 3)一些Topic model的方法,按照topic tag分类
  • 第一种approach,需要3个部分1)某一种embedding method去吧每个问题从document变成vector,典型的方法有bow 2)选择某种metrc 去计算特定个vector之间的距离,典型的有euclidean dist, cosine similarity 3)选择某种方法筛选出最相近的K个问题推荐给用户,最简单的是挨个计算distance,然后排序返还最相近的k个
    • 更prefer eucliden还是cosine similarity?

—Feature Engineering + Modeling

之前把这两个分开的原因是面试官可能会分别问这两个的问题

但实际上,你的feature engineering一定是要为model服务的

这里给出我经常会使用的几种简单的推荐系统模型

(0). rule based model (难易程度1, make sense程度5)

(1). 转化成classification/regression 模型 (难易程度2, make sense程度5)

(2). matrix factorization (难易程度3, make sense程度2)

(3). factorization machine (难易程度3, make sense程度3)

(4). wide and deep learning (难易程度4, make sense程度4)

我个人最喜欢的,而且觉得比较稳妥的是(0)+(1),说的好是绝对没问题的

至于(2)(3)(4)网上有大把的资料,但你不能看一篇paper或者video就上了,一定要谨慎想好可能遇到的问题,比如

  • matrix factorization可能会问你怎么加进user and item metadata, 你的retrain plan是什么

  • wide and deep learning可能会问你deep part的神经网络怎么搭的,为什么

一句话追求fancy就要担着玩儿大了的风险

对于(0)来说,作为一个basis model有提到的价值,在处理cold start的时候是很有用的,但不可恋战一定尽快带过

rule无外乎根据两个heuristic

  • 对user_i 找到和ta类似的user,看看别人买了什么,推荐给ta

  • 对user_i买的item_j, 找到类似的东西,推荐给ta

这里面涉及的知识点就是怎么定义“类似”,即需要一个distance metric,这点大家去网上找吧,很多的similarity metrics

重点是(1),这里依然拿spotify music 推荐当例子那么如果转化成一个binary classification问题,我们就要拿出所有的music listening history log, 即tuple of (user, item, context),

比如(berserker888, Titanium, 2019-01-01:12:00:00)当做我们的positive case,即y值为1

那我听了这一首歌,就没有听剩下的所有歌,那我在这个context下所有没听的歌理论上都是negative case, 这里要知道unbalanced data的危害以及一种random sample negative case的逻辑。

同样可以把问题转化成regression问题,就像之前说的,我看了没有听,我听了,和我压根没看的东西代表着3种preference level.

下面简短启发一下feature的展开,user, item, context都是需要vectorize化的

最简单的就是直接join user, item dimension table拿到metadata。

但有的时候比如item是视频图像或者自然语言,就需要我们通过一些方法,比如pretrained deep learning model take bottleneck layer,请大家自行搜索word embedding和image embedding

稍微tricky一点的是contextual feature,这里可能需要aggregate log来达到目的,我建议大家提前想好一些feature,给一个例子:has user_i listened to item_j’s category in the past week at night?

  • Data cleaning
    • Data spill
    • Pre-processing
    • Post-processing
  • Missing data:
    • -If data size is large, can ignore.
    • -Make missing data a new categorical feature.
    • -If linear, take mean to fill in…
  • Pvalue是什么
  • 怎样判断和处理overfitting / underfitting
    • @Note: more in https://blog.csdn.net/tz_zs/article/details/78588478
    • 数据 特征 模型
    • 过拟合的原因就类似
      • 模型过于复杂(类似神经网络)
      • no regularization
        • 正则化技术还有:扩增样本集、早停止、Dropout、集成学习、多任务学习、对抗训练、参数共享等。
      • epoch过大
      • 数据量太小
  • unbalanced data的危害 - 7 Techniques to Handle Imbalanced Data
      1. Use the right evaluation metrics
      1. Resample the training set
      1. Use K-fold Cross-Validation in the right way
      1. Ensemble different resampled datasets
      1. Resample with different ratios
      1. Cluster the abundant class
      1. Design your own models
  • logistic regression的objective function, softmax的细节
  • MLE / MAP 推导
    • Maximum Likelihood Estimation, MLE是频率学派常用的估计方法
    • Maximum A Posteriori, MAP是贝叶斯学派常用的估计方法!
  • Feature engineering
  • Compute复杂度优化
    • 可以分布式计算,例如10台计算机,每个分别找出k个最相近的,然后master node使用min heap 分路归并可以快速从10*K个最小的结果中找出最小的k个
  • 升query速度, 基本上对每个没有incoming edge的node,算出所有这个node到可到node的汇率,存一下,computational complexity在O(N), query complexity在O(1)
  • 怎么做model diagnosis
  • How to debug model
  • 优化模型

—Evaluation

一定要区分statistical metric和business metric

  • 前者是你训练模型时候定义的metric,主要用来tune hyperparameter的,比如accuracy, F1-score…
  • 后者定义你的模型是否有意义的metric,比如推荐歌单平均听的时长

一般后者意义更大,更会被问。因为是无法直接直接optimize的,只能通过ab testing才测试,所以也有可能会被问到一点点experiment design的知识

Model metrics

  • AUC
  • Log​ ​loss:​ ​cross-entropy
  • F-score:

– ML engineering (ML Engineer 重点)

  • Testing various aspects of the pipeline (data pre-processing and augmentation, input and output sanitization, model inference time).
    • Data validation testing
  • Model versioning, rollback, disaster recovery
    • Alarm setup
  • Setting up a distributed infrastructure to run training, hyperparameter search, or inference more efficiently.
  • Note@Forecasting
    • Forecast 的流程图
      • Data ingestion
      • Data validation
      • Data pre-process: add, delete, regularization, normalization
      • Mode training in Sagemaker
      • Training notification via email/msg to owners
      • Model approval before in prod
      • Model deploy to prod, old model artifacts archive to s3
      • Cloudwatch metrics to monitor traffic and model performance
      • Alarms to auto-trigger rollback if errors/exceptions found
      • Auto generate weekly review report
      • Feedback section so users can submit suggestions, feature request, or bugs
    • Time-series 算法
      • Autoregressive Integrated Moving Average (ARIMA) Algorithm
      • DeepAR+ Algorithm*
        • arn:aws:forecast:::algorithm/Deep_AR_Plus
      • Exponential Smoothing (ETS) Algorithm
        • arn:aws:forecast:::algorithm/ETS
      • Non-Parametric Time Series (NPTS) Algorithm
        • arn:aws:forecast:::algorithm/NPTS
      • Prophet Algorithm
        • arn:aws:forecast:::algorithm/Prophet

ML的系统设计题目

  • [Yelp] 饭馆的推荐,涉及到了geolocation information
  • [Instgram] Story推荐,每条Story是独一无二的并且是有时间性的
  • [Spotify] 音乐推荐,怎么把音乐做个embedding
  • [FB] news feed或者ads recommendation
    • Facebook Newsfeed推荐,涉及到了不同user之前的networking
  • [linkedin] On linkedin, create a metric so that you can rank activities (changing a job, sharing an article, having new connection and so on) and display the top k, next top k activities to to a user.
  • [Google] 怎么设计一个ML system来解crossword puzzle
  • [Quora] 怎么设计 一个给User推荐Question的系统

References - System Design Interview Prep

  1. Nail the System Design Interview - https://blog.tryexponent.com/how-to-nail-the-system-design-interview/
  2. How to prepare for the System Design Interview - https://www.educative.io/blog/how-to-prepare-system-design-interview
  3. 31 System Design Interview Questions - https://igotanoffer.com/blogs/tech/system-design-interviews
  4. LeetCode.ca design https://leetcode.ca/tags/#.Design
  5. design summary https://darktiantian.github.io/LeetCode%E7%AE%97%E6%B3%95%E9%A2%98%E6%95%B4%E7%90%86%EF%BC%88%E8%AE%BE%E8%AE%A1%E7%AF%87%EF%BC%89Design/
  6. https://github.com/donnemartin/system-design-primer
  7. https://www.youtube.com/watch?v=43bB7oSn190&ab_channel=Jordanhasnolife
  8. LeetCode Template: https://leetcode.com/discuss/career/229177/My-System-Design-Template
  9. https://aaronice.gitbooks.io/system-design/system-design-problems/designing-an-api-rate-limiter.html
  10. https://tianpan.co/notes/2016-02-13-crack-the-system-design-interview
  11. book Grokking-System-Design-Interview
  12. CRLS https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/

.