• CHAPTER 8: DESIGN A URL SHORTENER


    Step 1 - Understand the problem and establish design scope

    Candidate: Can you give an example of how a URL shortener work?
    Interviewer: Assume URL
    https://www.systeminterview.com/q=chatsystem&c=loggedin&v=v3&l=long is the original
    URL. Your service creates an alias with shorter length: https://tinyurl.com/ y7keocwj. If you
    click the alias, it redirects you to the original URL.
    Candidate: What is the traffic volume?
    Interviewer: 100 million URLs are generated per day.
    Candidate: How long is the shortened URL?
    Interviewer: As short as possible.
    Candidate: What characters are allowed in the shortened URL?
    Interviewer: Shortened URL can be a combination of numbers (0-9) and characters (a-z, AZ).
    Candidate: Can shortened URLs be deleted or updated?
    Interviewer: For simplicity, let us assume shortened URLs cannot be deleted or updated.
    Here are the basic use cases:
    1.URL shortening: given a long URL => return a much shorter URL
    2.URL redirecting: given a shorter URL => redirect to the original URL
    3.High availability, scalability, and fault tolerance considerations

    Back of the envelope estimation

    • Write operation: 100 million URLs are generated per day.
    • Write operation per second: 100 million / 24 /3600 = 1160
    • Read operation: Assuming ratio of read operation to write operation is 10:1, read
    operation per second: 1160 * 10 = 11,600
    • Assuming the URL shortener service will run for 10 years, this means we must support
    100 million * 365 * 10 = 365 billion records.
    • Assume average URL length is 100.
    • Storage requirement over 10 years: 365 billion * 100 bytes * 10 years = 365 TB
    It is important for you to walk through the assumptions and calculations with your
    interviewer so that both of you are on the same page.

    Step 2 - Propose high-level design and get buy-in

    API Endpoints

    1.URL shortening. To create a new short URL, a client sends a POST request, which contains
    one parameter: the original long URL. The API looks like this:
    POST api/v1/data/shorten
    • request parameter: {longUrl: longURLString}
    • return shortURL
    2.URL redirecting. To redirect a short URL to the corresponding long URL, a client sends a
    GET request. The API looks like this:
    GET api/v1/shortUrl
    • Return longURL for HTTP redirection

    URL redirecting

    在这里插入图片描述
    Figure 8-1 shows what happens when you enter a tinyurl onto the browser. Once the server receives a tinyurl request, it changes the short URL to the long URL with 301 redirect.
    在这里插入图片描述

    301 redirect. A 301 redirect shows that the requested URL is “permanently” moved to the long URL. Since it is permanently redirected, the browser caches the response, and subsequent requests for the same URL will not be sent to the URL shortening service. Instead, requests are redirected to the long URL server directly.
    302 redirect. A 302 redirect means that the URL is “temporarily” moved to the long URL, meaning that subsequent requests for the same URL will be sent to the URL shortening service first. Then, they are redirected to the long URL server.

    The most intuitive way to implement URL redirecting is to use hash tables.

    URL shortening

    在这里插入图片描述
    • Each longURL must be hashed to one hashValue.
    • Each hashValue can be mapped back to the longURL.

    Step 3 - Design deep dive

    Data model

    In the high-level design, everything is stored in a hash table.
    A better option is to store mapping in a relational
    database.
    在这里插入图片描述

    Hash function

    The hashValue consists of characters from [0-9, a-z, A-Z], containing 10 + 26 + 26 = 62 possible characters. To figure out the length of hashValue, find the smallest n such that 62^n ≥ 365 billion.
    在这里插入图片描述
    在这里插入图片描述
    even the shortest hash value (from CRC32) is too long (more than 7
    characters). How can we make it shorter?
    在这里插入图片描述
    still too expensive, use bloom filter(method to test if something is in a set)

    Base 62 conversion

    在这里插入图片描述

    在这里插入图片描述

    URL shortening deep dive

    在这里插入图片描述

    The distributed unique ID generator is worth mentioning.

    URL redirecting deep dive

    As there are more reads than writes, mapping is stored in a cache to improve performance.
    在这里插入图片描述

    Step 4 - Wrap up

    If there is extra time at the end of the interview, here are a few additional talking points.
    • Rate limiter: A potential security problem we could face is that malicious users send an overwhelmingly large number of URL shortening requests. Rate limiter helps to filter out requests based on IP address or other filtering rules. If you want to refresh your memory
    about rate limiting, refer to “Chapter 4: Design a rate limiter”.
    • Web server scaling: Since the web tier is stateless, it is easy to scale the web tier by adding or removing web servers.
    • Database scaling: Database replication and sharding are common techniques.
    • Analytics: Data is increasingly important for business success. Integrating an analytics solution to the URL shortener could help to answer important questions like how many people click on a link? When do they click the link? etc.
    • Availability, consistency, and reliability. These concepts are at the core of any large
    system’s success. We discussed them in detail in Chapter 1, please refresh your memory on these topics.

  • 相关阅读:
    Linux 可执行文件瘦身指令 strip 使用示例
    向量数据库Milvus Cloud 核心组件Knowhere升级,支持 GPU 索引和 Cosine 相似性类型
    基于Java的学校固定资产管理系统设计与实现(源码+lw+部署文档+讲解等)
    面试复习整理
    Fiori VS code 连接配置本地SDK yaml 文件配置
    dbca静默安装及建库
    记:谷歌开发者大会2022——共码未来
    迈动互联中标北京人寿保险,助推客户提升品牌价值
    【计算机网络】 IP协议格式以及以太网帧结构
    五个分层维度:SpringBoot工程分层实战
  • 原文地址:https://blog.csdn.net/HuiFeiDeTuoNiaoGZ/article/details/133117599