| Name | HyperText Transfer Protocol (HTTP) |
| Concept | An application-layer protocol used to transfer web pages from a server to a client. |
| Composition | A web page consists of objects (an HTML file, a JPEG image, an audio clip, an applet), each addressed by a single URL |
| | Typically an HTML document plus several other referenced objects |
| | Server | Client |
| Concept | A piece of software that exposes a listening socket and waits for requests; it services each request and provides a response | A piece of software that initiates a request, then receives and processes the response |
| Accessing a web page | Sends a response containing the HTML document and other referenced resources | Sends a request for the resource/document (web page) addressed by ONE URL |
| | Extracts user data, performs the task, and sends a response | Marshals user information, sends a request containing it to the server-side program, and parses the response |
| Background | HTTP is an application-layer protocol | |
| It relies on many other protocols: TCP (provides reliable, in-order delivery), IP (delivers data packets between the end hosts), and Layer 2 (deals with the individual networks, e.g. MAC) | ||
| OSI reference model | Application | Delivers services and applications; deals with the advanced application-specific functionality. Examples: HTTP, FTP, SMTP |
| | Presentation | Examples: JPEG, GIF, MPEG |
| | Session | Examples: AppleTalk, Winsock |
| | Transport | Delivers data segments between processes; deals with things like reliability and flow control. Examples: TCP, UDP, SPX |
| | Network | Delivers data packets between hosts, across multiple networks. Examples: IP, ICMP, IPX |
| | Data link | Examples: Ethernet, ATM |
| | Physical | Examples: Ethernet, Token Ring |
| routing information | ![]() | |
| Relationship | HTTP is built over TCP: Socket socket = new Socket("10.0.0.1", 80); |
| TCP makes life easier for HTTP - no packet loss, no out-of-order delivery, congestion/flow control automatically handled | |
| TCP allows HTTP to focus on its own functionality | |
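
To make the "HTTP is built over TCP" point concrete, here is a minimal sketch in which a client opens a TCP socket, writes an HTTP request as plain text, and reads back the status line. So that it runs without any external host, it starts a one-shot server on the loopback interface in the same process. The class name `HttpOverTcpSketch`, the helper names, and the `/index.html` path are assumptions made for illustration, not part of the notes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class HttpOverTcpSketch {

    // Hypothetical one-shot server: accepts one connection, reads the request
    // headers, and writes a fixed response.
    static void serveOnce(ServerSocket listener) {
        try (Socket conn = listener.accept()) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.US_ASCII));
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // Skip the request line and headers; the blank line ends them.
            }
            String body = "<html>hello</html>";
            String response = "HTTP/1.1 200 OK\r\n"
                    + "Content-Type: text/html\r\n"
                    + "Content-Length: " + body.length() + "\r\n"
                    + "Connection: close\r\n"
                    + "\r\n"
                    + body;
            OutputStream out = conn.getOutputStream();
            out.write(response.getBytes(StandardCharsets.US_ASCII));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Client side: creating the Socket is what triggers TCP connection setup.
    // TCP then delivers the bytes reliably and in order; HTTP only formats text.
    static String fetchStatusLine(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            String request = "GET " + path + " HTTP/1.1\r\n"
                    + "Host: " + host + "\r\n"
                    + "Connection: close\r\n"
                    + "\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            return in.readLine(); // the first response line is the status line
        }
    }

    // Runs server and client in one process over the loopback interface.
    static String roundTrip() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            Thread server = new Thread(() -> serveOnce(listener));
            server.start();
            String status = fetchStatusLine(
                    loopback.getHostAddress(), listener.getLocalPort(), "/index.html");
            server.join();
            return status;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

Neither side handles retransmission, ordering, or flow control; the code only reads and writes a byte stream, which is the division of labour the rows above describe.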
| TCP | Full name: Transmission Control Protocol |
| Characteristics: a connection-oriented protocol that provides a reliable, unicast (one-to-one), end-to-end byte stream over an unreliable internetwork | |
| Connection-oriented | Meaning: before any data transfer, TCP establishes a connection. One TCP entity (the "server") waits for a connection, and the other TCP entity (the "client") contacts the server |
| When it happens: when you create a new Socket | |
| Reliable | The byte stream is broken up into chunks called segments. The receiver sends ACKs for segments; TCP maintains a timer, and a segment is retransmitted if an ACK is not received in time. |
| Detecting errors: TCP has checksums for header and data. Segments with invalid checksums are discarded. Each byte that is transmitted has a sequence number. | |
| Byte stream service | Internally, TCP deals with segments: it sends a segment and resends it if it is lost. To higher layers, TCP exposes a byte stream service |
| TCP format | ![]() |
| Definition | A URL is a way of locating a resource on the Internet |
| Composition | protocol://hostname[:port]/path/filename#section; protocol: the protocol used to access the server; hostname[:port]: the name of the server (if no port is given, the default for the protocol is used); /path/filename#section: the location of a file on the server |
| filename#section | filename: points to a file in the directory specified by path. If omitted, it is left to the server to decide which file to send (it may send an index of the directory, often held in a file called index.html) |
| section: a named anchor (fragment/ref) in an HTML document, i.e. a way of marking a link target position; created using a tag ... | |
| other possible protocols to use with URLs | ![]() |
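
As a small illustration of the URL components listed above, the following sketch splits a URL with the standard `java.net.URI` class and fills in the default port when none is given. The class name `UrlPartsSketch`, the `example.com` URLs, and the `|`-separated output format are assumptions chosen for the example.

```java
import java.net.URI;

final class UrlPartsSketch {

    // Default ports for the two protocols discussed in these notes.
    static int defaultPort(String scheme) {
        switch (scheme) {
            case "http":  return 80;
            case "https": return 443;
            default:      return -1; // unknown to this sketch
        }
    }

    // Returns "protocol|hostname|port|path|section" for a URL string.
    static String describe(String url) {
        URI uri = URI.create(url);
        // getPort() returns -1 when the URL carries no explicit port.
        int port = uri.getPort() != -1 ? uri.getPort() : defaultPort(uri.getScheme());
        // getFragment() returns null when there is no #section part.
        String section = uri.getFragment() == null ? "" : uri.getFragment();
        return String.join("|", uri.getScheme(), uri.getHost(),
                String.valueOf(port), uri.getPath(), section);
    }

    public static void main(String[] args) {
        System.out.println(describe("http://www.example.com/docs/page.html#intro"));
        System.out.println(describe("https://www.example.com:8443/a/b.html"));
    }
}
```

Note that the section (fragment) is resolved by the client; it is not part of what gets requested from the server.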
| Role | HTTP specifies: - how a client and server establish a connection - how the client requests data from the server - how the server responds to requests - how a connection is closed |
| Characteristic: stateless | - The server doesn't remember anything about previous connections, which keeps the protocol simple and robust. - This can lead to inefficiencies |
| Initialising a connection |
|
| HTTP step by step |
|
| GET | Query string incorporated in the request URL |
| Idempotent: multiple identical requests have the same effect as a single one | |
| Cacheable | |
| POST | Query string placed in the body of the HTTP request |
| Non-idempotent | |
| Used when the request alters data on the server side |
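
The difference in where the query string travels can be seen by building both requests as text. This is a minimal sketch, assuming a hypothetical `/search` path, host, and form fields; `GetVsPostSketch` and its method names are made up for the example.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class GetVsPostSketch {

    // Hypothetical form fields used by the example.
    static Map<String, String> sampleFields() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("q", "http basics");
        fields.put("lang", "fr");
        return fields;
    }

    // Encodes fields as name=value pairs joined by '&' (spaces become '+').
    static String encodeForm(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            joiner.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    // GET: the query string is appended to the path in the request line.
    static String buildGet(String host, String path, Map<String, String> fields) {
        return "GET " + path + "?" + encodeForm(fields) + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "\r\n";
    }

    // POST: the same pairs travel in the entity body, after the blank line.
    static String buildPost(String host, String path, Map<String, String> fields) {
        String body = encodeForm(fields);
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Content-Type: application/x-www-form-urlencoded\r\n"
                + "Content-Length: " + length + "\r\n"
                + "\r\n"
                + body;
    }

    public static void main(String[] args) {
        System.out.println(buildGet("www.example.com", "/search", sampleFields()));
        System.out.println(buildPost("www.example.com", "/search", sampleFields()));
    }
}
```

Because the GET variant keeps everything in the URL, the same URL can be cached or bookmarked; the POST variant needs `Content-Length` so the server knows where the body ends.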
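- An HTTP GET request message
-
- GET /somedir/index.html HTTP/1.1
- --Asks the server for the file index.html in the given directory, using HTTP version 1.1
-
- Host: www.qmul.ac.uk
- --Specifies the host name (and optionally the port number) of the target server (www.qmul.ac.uk)
- Connection: close
- --Whether to close the connection to the server once the request completes; here the client doesn't want to use persistent connections
- User-Agent: Mozilla/5.0
- --The application type or operating system of the client sending the request (Mozilla/5.0); a Netscape browser
- Accept-Language: fr
- --The language the client supports (fr); it prefers to receive a French version if such a version exists
- (extra carriage return and line feed)
- --Marks the end of the request message headers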
-
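- HTTP/1.1 200 OK
- --HTTP response status line; status code 200 means the request succeeded and the information is in the response
-
- Connection: close
- --Whether the server closes the connection to the client after the response; here the server is going to close the TCP connection
- Date: Fri, 10 Nov 2000 12:01:14 GMT
- --The time at which the response message was generated
- Server: Apache/1.3.0 (Unix)
- --The server software that produced the response (Apache/1.3.0 (Unix))
- Last-Modified: Mon, 20 Jul 1999 08:44:01 GMT
- --The date and time at which the requested resource was last modified
- Content-Length: 5993
- --The length of the response body (5993 bytes)
- Content-Type: text/html
- --The data type of the response body (text/html)
- (data data data ...)
- --The response body is abbreviated here as "(data data data ...)"; in a real response it holds the actual data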
Request message format:
method (GET/POST) | path | version (HTTP/1.1)
header field name: value
...
entity body (form name-value pairs if POST, not used if GET)
Response message format:
version (HTTP/1.1) | status code (200/400) | phrase (OK/Bad Request)
header field name: value
...
entity body (the requested resource, e.g. an HTML document)
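
The first line of each message format above can be split mechanically. The sketch below does this for a request line and a status line; `StartLineSketch` and its method names are hypothetical, and the parsing is deliberately simpler than what a production HTTP library would do.

```java
final class StartLineSketch {

    // Splits "GET /somedir/index.html HTTP/1.1" into {method, path, version}.
    static String[] parseRequestLine(String line) {
        String[] parts = line.split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        return parts;
    }

    // Splits "HTTP/1.1 400 Bad Request" into {version, status code, phrase}.
    // The phrase may contain spaces, so the split is limited to three parts.
    static String[] parseStatusLine(String line) {
        String[] parts = line.split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Malformed status line: " + line);
        }
        Integer.parseInt(parts[1]); // throws NumberFormatException if not numeric
        String phrase = parts.length == 3 ? parts[2] : "";
        return new String[] {parts[0], parts[1], phrase};
    }

    public static void main(String[] args) {
        String[] request = parseRequestLine("GET /somedir/index.html HTTP/1.1");
        System.out.println(request[0] + " -> " + request[1]);
        String[] status = parseStatusLine("HTTP/1.1 400 Bad Request");
        System.out.println(status[1] + " -> " + status[2]);
    }
}
```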
| Premise | Without persistent connections, each resource (on a server) requires a separate TCP session |
| Alternative | Persistent connections: - The server leaves a TCP connection open (for some time) after sending a response. |
| Result | Subsequent requests and responses between the same client and server can be made over the same connection. |
| Modes | With pipelining: - All the requests for referenced objects are sent back-to-back, leading to as little as one round trip for all the referenced objects. |
| Without pipelining: - The client sends the next request only after the previous response has arrived (one round trip per object) - Usually multiple resources are obtained over parallel TCP connections, which speeds up downloads for complex web pages | |
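
A rough way to compare the connection modes is to count round trips for a page with a base HTML document and a number of referenced objects. The sketch below uses a simplified textbook-style model that is an assumption of this example: one round trip for TCP setup, one per request/response, no transmission time, and no parallel connections. With pipelining, the requests for all referenced objects are sent back-to-back. The class and method names are hypothetical.

```java
final class RoundTripSketch {

    // Non-persistent: every object (base page + referenced objects) pays
    // one round trip for TCP setup and one for the request/response.
    static int nonPersistent(int referencedObjects) {
        return 2 * (referencedObjects + 1);
    }

    // Persistent without pipelining: one TCP setup, then one round trip
    // per object because each request waits for the previous response.
    static int persistentWithoutPipelining(int referencedObjects) {
        return 1 + (referencedObjects + 1);
    }

    // Persistent with pipelining: one TCP setup, one round trip for the base
    // page, then one more for all referenced objects requested back-to-back.
    static int persistentWithPipelining(int referencedObjects) {
        return 1 + 1 + (referencedObjects > 0 ? 1 : 0);
    }

    public static void main(String[] args) {
        int n = 10; // hypothetical page with 10 referenced objects
        System.out.println("non-persistent:       " + nonPersistent(n));
        System.out.println("persistent, no pipe:  " + persistentWithoutPipelining(n));
        System.out.println("persistent, pipeline: " + persistentWithPipelining(n));
    }
}
```

Under these assumptions a page with 10 referenced objects costs 22, 12, and 3 round trips respectively; real browsers blur the picture by opening several connections in parallel.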
- GET /path/file.html HTTP/1.1
- Connection: keep-alive
- User-Agent: HTTPTool/1.1
-
- HTTP/1.1 200 OK
- Date: Fri, 31 Dec 1999 23:59:59 GMT
- Content-Type: text/html
- Content-Length: 1354
- <html>
- <body>
- <h1>Happy New Year!</h1>
- (more file contents) . . .
- </body>
- </html>
-
- • Client request:
- GET / HTTP/1.1
- Host: www.google.com
- (Followed by a new line, in the form of a carriage return followed by a line feed.)
-
- • Server response:
- HTTP/1.1 200 OK
- Content-Length: 3059
- Server: GWS/2.0
- Date: Sat, 11 Jan 2003 02:44:04 GMT
- Content-Type: text/html
- Cache-control: private
- Set-Cookie: PREF=ID=73d4aef52e57bae9:TM=1042253044:LM=1042253044:S=SMCc_HRPCQiqyX9j; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
- Connection: keep-alive
- (Followed by a blank line and HTML text comprising the Google home page.)
- <HTML><body> ...
| Cache-Control | Holds instructions for caching in both requests and responses |
| ETag (entity tag) | An identifier for a specific version of a resource |
| Vary | Names the request headers a cache must compare to determine whether a cached response may be returned for a subsequent request |
| Date | Shows the timestamp of when the response was generated |
| Expires | Shows the time at which the resource expires |
| Pragma | Older header similar to Cache-Control (e.g. Pragma: no-cache is often used to disable caching) |
| Content-Length | Shows the length of the resource in bytes |
| Content-Encoding | Describes how the content is encoded, e.g. gzip |
| Content-Type | MIME type of object, e.g. text/html |
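
Headers such as the ones in this table arrive as "Name: value" lines before the blank line that ends the header section. The following sketch parses such a block into a map and reads two of the fields; `HeaderBlockSketch` is a hypothetical name, and the sample values are taken from the example response earlier in these notes.

```java
import java.util.Map;
import java.util.TreeMap;

final class HeaderBlockSketch {

    // Parses "Name: value" lines up to the first blank line.
    // Header names are case-insensitive, hence the case-insensitive map.
    // Simplification: a repeated header overwrites the earlier value.
    static Map<String, String> parseHeaders(String headerBlock) {
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : headerBlock.split("\r\n")) {
            if (line.isEmpty()) {
                break; // blank line: the entity body starts after this
            }
            int colon = line.indexOf(':'); // first colon separates name and value
            if (colon <= 0) {
                continue; // not a header field line
            }
            headers.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
        }
        return headers;
    }

    // Returns the Content-Length value, or -1 if the header is absent.
    static long contentLength(Map<String, String> headers) {
        String value = headers.get("Content-Length");
        return value == null ? -1 : Long.parseLong(value);
    }

    public static void main(String[] args) {
        String block = "Content-Length: 3059\r\n"
                + "Date: Sat, 11 Jan 2003 02:44:04 GMT\r\n"
                + "Content-Type: text/html\r\n"
                + "Cache-control: private\r\n"
                + "\r\n"
                + "<HTML><body> ...";
        Map<String, String> headers = parseHeaders(block);
        System.out.println(contentLength(headers));
        System.out.println(headers.get("content-type"));
        System.out.println(headers.get("Cache-Control"));
    }
}
```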
| 4 components to consider | Cookie header line (Set-Cookie) in the HTTP response message |
| Cookie header line (Cookie) in the HTTP request message | |
| Cookie file kept on the user's host, managed by the user's browser | |
| Backend database at the website | |
| Example | - Susan always accesses the internet from her PC. - Assume that she visits a specific e-commerce site for the first time. - When the initial HTTP request arrives at the site, the site creates: (1) a unique ID; (2) an entry in the backend database for that ID |
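
The client side of this exchange can be sketched as a tiny cookie jar: it stores the name=value pair from a `Set-Cookie` response header and echoes it in a `Cookie` request header on later requests. `CookieJarSketch`, the in-memory map standing in for the cookie file, and the ID value 1678 are all assumptions for illustration; attributes such as expiry, path, and domain are ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class CookieJarSketch {

    // In-memory stand-in for the cookie file kept on the user's host.
    private final Map<String, String> cookies = new LinkedHashMap<>();

    // Handles the value of one Set-Cookie response header, e.g. "id=1678; path=/".
    // Only the leading name=value pair is stored; attributes are ignored here.
    void storeFromSetCookie(String setCookieValue) {
        String pair = setCookieValue.split(";", 2)[0].trim();
        int eq = pair.indexOf('=');
        if (eq <= 0) {
            return; // no usable name=value pair
        }
        cookies.put(pair.substring(0, eq), pair.substring(eq + 1));
    }

    // Builds the Cookie header line for the next request, or "" if nothing is stored.
    String cookieHeader() {
        if (cookies.isEmpty()) {
            return "";
        }
        StringJoiner joiner = new StringJoiner("; ", "Cookie: ", "");
        cookies.forEach((name, value) -> joiner.add(name + "=" + value));
        return joiner.toString();
    }

    public static void main(String[] args) {
        CookieJarSketch jar = new CookieJarSketch();
        // First visit: the site's response carries a freshly created ID.
        jar.storeFromSetCookie("id=1678; path=/");
        // Later visits: the browser sends the ID back so the site can look it up.
        System.out.println(jar.cookieHeader());
    }
}
```

On the server side, the returned ID is the key into the backend database entry created on the first visit, which is how state is layered on top of a stateless protocol.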
| Overview | Full name | HTTP Secure |
| vs. HTTP | The same as HTTP but runs over TLS (Transport Layer Security), on port 443 | |
| Features | - All traffic (headers and payloads) is encrypted - Authenticates the server - Prevents others from sniffing traffic | |
| HTTPS URLs | - Near-identical to HTTP URLs - The protocol changes from http:// to https:// | |
| HTTPS issues | Adds extra overhead - you should only use HTTPS when necessary - Some organisations are pushing for all HTTP to be encrypted | |
| - Increases connection setup time - Requires TCP setup + TLS handshake | ||
| Thus, it increases page load time, which is not good for requesting small resources | ||
| How Secure is HTTPS | - The security of HTTPS depends on that of the underlying TLS protocol - A website that uses mixed protocols (e.g., images served via HTTP, login info via HTTPS) can still make the user vulnerable to attacks/surveillance | |
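
The mixed-protocol risk mentioned above can be checked mechanically: given a page URL and the URLs of the resources it references, list those that would travel over plain HTTP. This is a minimal sketch with hypothetical names (`MixedContentSketch`, the `example.com` URLs); it only compares URL schemes and does not model how browsers actually block or upgrade such requests.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MixedContentSketch {

    // Returns the resource URLs that an HTTPS page would fetch over plain HTTP.
    static List<String> insecureResources(String pageUrl, List<String> resourceUrls) {
        List<String> insecure = new ArrayList<>();
        if (!"https".equalsIgnoreCase(URI.create(pageUrl).getScheme())) {
            return insecure; // the page itself is not HTTPS, so nothing is "mixed"
        }
        for (String resource : resourceUrls) {
            // Relative URLs have no scheme of their own (getScheme() is null);
            // they inherit the page's scheme and are therefore not flagged.
            if ("http".equalsIgnoreCase(URI.create(resource).getScheme())) {
                insecure.add(resource);
            }
        }
        return insecure;
    }

    public static void main(String[] args) {
        List<String> resources = Arrays.asList(
                "https://shop.example.com/app.js",
                "http://cdn.example.com/logo.jpg",
                "images/banner.png");
        System.out.println(insecureResources("https://shop.example.com/login", resources));
    }
}
```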
| Overview | HTTP 2.0 focuses on reducing page load times |
| SPDY | HTTP 2.0 is based on Google's SPDY |
| A protocol developed by Google | |
| Deployed on Google servers; also supported by Twitter, Facebook, imgur, Blogspot | |
| Taken up by the IETF, with its principles pushed into HTTP 2.0 | |
| SPDY and HTTP 2.0 | Multiplexing: multiple resources can be requested and fetched in parallel; prevents "head of line" blocking |
| Universal encryption: all traffic is encrypted by default; equivalent to running everything over HTTPS | |
| Server push/hint: the server can push resources before they are requested; the server can "hint" that clients fetch resources (e.g. if the server knows the client will need something in the future) | |
| Content prioritisation: specify the preferred order and priority in which the server transfers resources to the client |
