System design, step by step!
This is Day 1 of writing about system design. The content is inspired by the "Grokking Modern System Design" series.
Step 1: Requirement Clarifications
Try to clarify the scope of the problem. Design rounds are subjective, and there is no single correct answer.
But the interviewer definitely has some idea in mind. Try to capture it by asking questions, and note down the answers.
Questions like:
- Can users follow each other?
- What can a user share (photos, videos, etc.)?
- Should we focus only on the backend?
- Should there be a search functionality?
All of these requirements collectively decide what our final design looks like.
Step 2: System Interface Definition
List down the APIs that you need to develop. This is essentially the beginning of the actual design and data model.
postTweet(user_id, tweet_data, tweet_location, timestamp...)
generateTimeline(user_id, current_time, location...)
You get the idea.
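To make the two calls above concrete, here is a minimal in-memory sketch. Everything beyond the two API names and their listed parameters is an assumption for illustration: the `TweetService` class, the `follow` helper, the `limit` parameter, and the newest-first ordering are hypothetical, and a real service would sit in front of a database rather than a Python list.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Tweet:
    tweet_id: int
    user_id: int
    tweet_data: str
    tweet_location: Optional[str]
    timestamp: float


class TweetService:
    """Hypothetical in-memory stand-in for the service behind the APIs."""

    def __init__(self) -> None:
        self._tweets: List[Tweet] = []
        self._follows: Dict[int, Set[int]] = {}

    def follow(self, follower_id: int, followee_id: int) -> None:
        # Assumed helper: record that follower_id follows followee_id.
        self._follows.setdefault(follower_id, set()).add(followee_id)

    def post_tweet(self, user_id: int, tweet_data: str,
                   tweet_location: Optional[str], timestamp: float) -> int:
        # Store the tweet and return its newly assigned ID.
        tweet_id = len(self._tweets) + 1
        self._tweets.append(
            Tweet(tweet_id, user_id, tweet_data, tweet_location, timestamp)
        )
        return tweet_id

    def generate_timeline(self, user_id: int, current_time: float,
                          limit: int = 20) -> List[Tweet]:
        # Collect tweets by the user and everyone they follow,
        # posted at or before current_time, newest first (assumed ordering).
        authors = self._follows.get(user_id, set()) | {user_id}
        visible = [
            t for t in self._tweets
            if t.user_id in authors and t.timestamp <= current_time
        ]
        visible.sort(key=lambda t: t.timestamp, reverse=True)
        return visible[:limit]
```

Writing the signatures out like this, even roughly, tends to surface questions for the interviewer early, such as whether the timeline is paginated or how it is ordered.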
Step 3: Back of the envelope estimation
This is where the scale kicks in. The scale decides your tradeoffs.
- How many tweets are posted per day?
- How much storage do we need?
- Network Bandwidth needed? --> try to mention this
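The three questions above reduce to a few lines of arithmetic. The sketch below shows one way to work them out. The input numbers (100 million tweets per day, 300 bytes per tweet, 5 years of retention, 10 reads per write) are placeholder assumptions, not real figures, and the `estimate` function is a hypothetical helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(tweets_per_day: int, avg_tweet_bytes: int,
             retention_years: int, reads_per_write: int) -> dict:
    """Back-of-the-envelope numbers from a handful of assumed inputs."""
    # Traffic: average write rate over the day.
    writes_per_sec = tweets_per_day / SECONDS_PER_DAY
    # Storage: raw tweet payload only, ignoring replication and indexes.
    storage_per_day_bytes = tweets_per_day * avg_tweet_bytes
    total_storage_bytes = storage_per_day_bytes * 365 * retention_years
    # Bandwidth: incoming writes, and outgoing reads scaled by the read ratio.
    ingress_bytes_per_sec = writes_per_sec * avg_tweet_bytes
    egress_bytes_per_sec = ingress_bytes_per_sec * reads_per_write
    return {
        "writes_per_sec": writes_per_sec,
        "storage_per_day_bytes": storage_per_day_bytes,
        "total_storage_bytes": total_storage_bytes,
        "ingress_bytes_per_sec": ingress_bytes_per_sec,
        "egress_bytes_per_sec": egress_bytes_per_sec,
    }


if __name__ == "__main__":
    # All four inputs are placeholder assumptions.
    result = estimate(100_000_000, 300, 5, 10)
    for name, value in result.items():
        print(f"{name}: {value:,.0f}")
```

With these placeholder inputs, raw storage comes to 30 GB per day and about 54.75 TB over five years (decimal units). Averages hide peaks, so it is worth mentioning that peak traffic could be several times higher.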
Step 4: Data Model
Data modelling essentially defines how data flows through your application. It helps you decide how to manage and partition the data.
There should be a clear definition of each entity and of how the entities interact.
User: UserID, Name, Email, DOB, CreationDate, LastLogin, etc.
Tweet: TweetID, Content, Location, [Tags] etc.
UserFollow: UserID1, UserID2
Which DB should we choose? --> Steps 1 and 2 should give you an idea.
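As an illustration only, here is how the three entities might look if the answer turned out to be a relational store. This is a sketch using Python's built-in `sqlite3` module with an in-memory database. Several details are assumptions that are not in the entity list: the `user_id` author column on tweets, storing tags as a single text column, reading `user_id1` as the follower and `user_id2` as the followee, and the `build_db` and `followers_of` helper names.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id       INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    email         TEXT NOT NULL UNIQUE,
    dob           TEXT,
    creation_date TEXT NOT NULL,
    last_login    TEXT
);
CREATE TABLE tweets (
    tweet_id INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL REFERENCES users(user_id),  -- assumed author link
    content  TEXT NOT NULL,
    location TEXT,
    tags     TEXT  -- assumed: comma-separated, for brevity
);
CREATE TABLE user_follow (
    user_id1 INTEGER NOT NULL REFERENCES users(user_id),  -- assumed: follower
    user_id2 INTEGER NOT NULL REFERENCES users(user_id),  -- assumed: followee
    PRIMARY KEY (user_id1, user_id2)  -- one row per follow relationship
);
"""


def build_db() -> sqlite3.Connection:
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def followers_of(conn: sqlite3.Connection, user_id: int) -> list:
    """Return the IDs of users who follow the given user."""
    rows = conn.execute(
        "SELECT user_id1 FROM user_follow WHERE user_id2 = ? ORDER BY user_id1",
        (user_id,),
    )
    return [row[0] for row in rows]
```

The composite primary key on `user_follow` is one way to state the "how they interact" part explicitly: a user can follow another user at most once. A wide-column or document store would express the same relationships differently.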
Step 5: High-Level Design (HLD)
You need to draw the design. A basic design would look something like this:
client | Load Balancer | [Server1, Server2, ... ServerN] | DB/FileStorage/Cache
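To show what the load balancer box in this diagram does, here is a minimal sketch. Round-robin is an assumed strategy chosen for simplicity (the diagram does not name one), and the `RoundRobinBalancer` class and server names are hypothetical.

```python
import itertools
from typing import List


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, one per request."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        # itertools.cycle repeats the list endlessly in order.
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        # Return the next server in the rotation.
        return next(self._cycle)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["Server1", "Server2", "Server3"])
    for request_id in range(5):
        print(request_id, "->", balancer.pick())
```

Real load balancers usually also track server health and may weight servers by capacity or current load, which is a good point to raise when discussing the diagram.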
Step 6: Detailed Design
Discuss every component in detail here (tradeoffs etc).
- Should we partition our data?
- What should be the ordering in the timeline?
- What components need load balancing?
- At which layer should we introduce the cache?
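Two of the questions above, partitioning and caching, can be sketched in a few lines. The choices here are assumptions for illustration, not the only options: partitioning by a hash of the user ID, an LRU eviction policy, and a cache placed in front of database reads. The names `shard_for`, `LRUCache`, and `read_through` are hypothetical.

```python
import hashlib
from collections import OrderedDict
from typing import Any, Callable


def shard_for(user_id: int, num_shards: int) -> int:
    """Map a user ID to a shard index using a stable hash."""
    digest = hashlib.sha256(str(user_id).encode()).hexdigest()
    return int(digest, 16) % num_shards


class LRUCache:
    """Keeps at most `capacity` items, evicting the least recently used."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[Any, Any]" = OrderedDict()

    def get(self, key: Any) -> Any:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: Any, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


def read_through(cache: LRUCache, key: Any,
                 load_from_db: Callable[[Any], Any]) -> Any:
    """Serve from the cache if possible, otherwise load and remember."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.put(key, value)
    return value
```

Hashing by user ID keeps all of a user's data on one shard but can create hot shards for very popular users. Partitioning by tweet ID spreads the load more evenly at the cost of querying several shards per timeline. This is the kind of tradeoff worth talking through in this step.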
Step 7: Identifying and resolving bottlenecks
Try to discuss the bottlenecks and ways to handle them.
- High Availability
- Data persistence
- High traffic
- Monitoring
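As one small example for the high-availability point, the sketch below reads from a list of replicas and falls back to the next one when a replica is unreachable. It is only an illustration under assumptions: replicas are modelled as plain Python callables, failure is signalled by `ConnectionError`, and `read_with_failover` is a hypothetical helper.

```python
from typing import Callable, Optional, Sequence


def read_with_failover(replicas: Sequence[Callable[[str], str]],
                       key: str) -> str:
    """Try each replica in order and return the first successful read."""
    last_error: Optional[Exception] = None
    for replica in replicas:
        try:
            return replica(key)
        except ConnectionError as exc:
            # This replica is unreachable, so move on to the next one.
            last_error = exc
    raise RuntimeError("all replicas failed") from last_error
```

Redundancy like this removes a single point of failure for reads. Writes need more thought, since the replicas must also be kept in sync.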
All the best!