HTTP
Introduction
The World Wide Web (WWW) originated at CERN, the European particle physics laboratory near Geneva. The emergence of WWW technology made the Internet develop faster than anyone had imagined. This TCP/IP-based technology quickly became the largest information system on the Internet, which had itself been in development for decades. Its success is attributed to its simplicity and practicality. Behind the WWW is a series of protocols and standards that make this work possible: the Web protocol family, which includes HTTP, the Hypertext Transfer Protocol.
In 1990, HTTP became the supporting protocol of the WWW. It was proposed by Tim Berners-Lee, the founder of the WWW. Later the World Wide Web Consortium (W3C) was established, and it worked with the IETF (Internet Engineering Task Force) to further improve and publish HTTP.
HTTP is an application layer protocol. Like other application layer protocols, it serves one specific type of application and is implemented by programs running in user space. HTTP itself is only a protocol specification recorded in documents; the actual communication is carried out by the programs that implement it.
HTTP communicates based on the B/S (browser/server) architecture. Server-side implementations of HTTP include httpd, nginx, etc., and client-side implementations are mainly web browsers, such as Firefox, Internet Explorer, Google Chrome, Safari, Opera, etc. In addition, client command-line tools include elinks, curl, etc. Web services are based on TCP, so in order to respond to client requests at any time, the Web server listens on a TCP port, 80 by default. In this way, the client browser and the Web server can communicate via HTTP.
Development stage
0.9
The 0.9 protocol is a simple and fast protocol for exchanging information, but it falls far short of the needs of the ever-growing range of applications. It is limited to text. Because the content type cannot be negotiated, the two sides have no way to agree on anything other than text during the exchange, so pictures cannot be displayed or processed.
1.0
HTTP/1.0 was published as RFC 1945 in 1996, with Tim Berners-Lee among its authors. Through continuous enrichment and development, HTTP/1.0 became the most important transaction-oriented application layer protocol. The protocol establishes and tears down a connection for each request/response. It is simple and easy to manage, so it met most needs and was widely used.
1.1
In the 1.0 protocol, the two parties stipulated the connection method and connection type, which greatly expanded the reach of HTTP. However, little consideration was given to speed and efficiency, which matter most on the Internet. After all, the creators of the protocol did not expect HTTP to become popular so quickly.
For the specific content of the HTTP/1.1 protocol, please refer to RFC 2616 (since superseded by RFC 7230 to RFC 7235, and later by RFC 9110 and RFC 9112).
2.0
The predecessors of HTTP/2 are HTTP/1.0 and HTTP/1.1. Although there were only two earlier versions, the specifications they contain are large enough to give any experienced engineer a headache. A new version of a network protocol does not immediately replace the old one. In fact, 1.0 and 1.1 coexisted for a long time, because network infrastructure is slow to update.
For the specific content of the HTTP/2 protocol, please refer to RFC 7540 (since superseded by RFC 9113).
Application scenarios
When HTTP was first created, it was mainly used to fetch web content. At that time the content was not as rich as it is now, the layout was not as refined, and there was almost no user interaction. For this simple task of fetching web content, HTTP performs quite well. But with the development of the Internet and the arrival of Web 2.0, more content began to be displayed (more image files), typesetting became more refined (more CSS), and more complex interactions were introduced (more JS). The total amount of data loaded and the number of requests needed to open a website's homepage keep increasing.
Today the homepage of most portals exceeds 2 MB, and the number of requests can be as high as 100. Another widespread application is client apps on the mobile Internet. Apps of different kinds use HTTP differently. An e-commerce app may need more than 10 requests to load its homepage. For an IM such as WeChat, HTTP requests may be limited to downloading voice and image files, and the request frequency is not high.
Working Principle
HTTP is based on the client/server model and is connection-oriented. A typical HTTP transaction proceeds as follows:
(1) The client establishes a connection with the server;
(2) The client sends a request to the server;
(3) The server accepts the request and returns the corresponding file as a response;
(4) The client and the server close the connection.
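The four steps above can be sketched with Python's standard library. This is only an illustrative sketch, not a reference implementation: the handler class `HelloHandler`, the helpers `one_shot_get` and `run_demo`, the response body, and the use of the loopback address with an automatically chosen port are all assumptions made for the example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def one_shot_get(host, port, path="/"):
    # (1) The client establishes a TCP connection with the server.
    with socket.create_connection((host, port), timeout=5) as sock:
        # (2) The client sends a request.
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        # (3) The server returns a response; read until it closes its side.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # (4) Leaving the "with" block closes the client side of the connection.
    return b"".join(chunks)


def run_demo():
    # Port 0 asks the operating system for any free port (demo assumption).
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return one_shot_get("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo().decode("ascii"))
```

Because the request uses HTTP/1.0 without keep-alive, the server closes the connection after one response, which is what lets the client simply read until end of stream.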
In HTTP/1.0, the connection between the client and the server is a one-time connection: each connection handles only one request. Once the server has returned the response, it closes the connection immediately, and the next request must establish a new one. The reasoning behind this design is that a WWW server serves thousands of users on the Internet but can only provide a limited number of connections. The server therefore does not leave connections idle, and releasing them promptly can greatly improve the server's efficiency.
HTTP is a stateless protocol: the server does not retain any state between transactions with clients. This greatly reduces the memory burden on the server and helps it respond quickly. HTTP is also described as an object-oriented protocol that can transfer any type of data object. It identifies the content and size of the transmitted data through the data type and length, and allows the data to be compressed for transmission. When a user follows a hypertext link defined in an HTML document, the browser establishes a connection with the specified server over TCP/IP.
HTTP supports persistent connections. In HTTP/0.9 and 1.0, the connection is closed after a single request/response pair. HTTP/1.1 introduced a keep-alive mechanism that lets a connection be reused for multiple requests. Such persistent connections can significantly reduce request latency, because after the first request the client does not need to repeat the TCP three-way handshake. Another positive side effect is that the connection usually becomes faster over time thanks to TCP's slow-start mechanism.
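Connection reuse can be observed with a small sketch built on Python's `http.client` and `http.server`. The handler `KeepAliveHandler`, the helper `fetch_many`, and the request paths are hypothetical names chosen for illustration. The sketch records the client's local TCP port for every request; if the same connection is reused, only one port should appear.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Announcing HTTP/1.1 lets http.server keep the connection open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = f"you asked for {self.path}\n".encode("ascii")
        self.send_response(200)
        # Content-Length tells the client where this response ends,
        # so the next response can follow on the same connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_many(paths):
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection(
        "127.0.0.1", server.server_address[1], timeout=5
    )
    results = []
    local_ports = set()
    try:
        for path in paths:
            conn.request("GET", path)
            # The local port only changes if a new TCP connection was opened.
            local_ports.add(conn.sock.getsockname()[1])
            response = conn.getresponse()
            results.append((response.status, response.read().decode("ascii")))
    finally:
        conn.close()  # close the client first so the handler loop can end
        server.shutdown()
        server.server_close()
    return results, local_ports


if __name__ == "__main__":
    results, ports = fetch_many(["/a", "/b", "/c"])
    print(results)
    print("TCP connections used:", len(ports))
```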
Version 1.1 of the protocol also improves on HTTP/1.0's use of bandwidth. For example, HTTP/1.1 introduced chunked transfer coding, which allows content on persistent connections to be streamed rather than buffered. HTTP pipelining further reduces latency by allowing the client to send multiple requests without waiting for each response. Another addition is byte serving, in which the server transmits only the part of a resource that the client explicitly requests.
Technically speaking, the client opens a socket connection to a specific TCP port on the server (generally port 80). If the server is listening on this well-known port, the connection is established. The client then sends a request message containing the request method over the connection.
HTTP defines nine request methods in common use: the eight in HTTP/1.1 (OPTIONS, GET, HEAD, POST, PUT, DELETE, TRACE and CONNECT) plus PATCH, which was added later. Each method specifies a different kind of exchange between the client and the server; the most commonly used are GET and POST. The server performs the corresponding operation, returns the result to the client as a response message, and finally closes the connection.
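As a rough illustration of chunked transfer coding, the sketch below frames a body as a series of chunks and decodes it again. Each chunk is a hexadecimal size line, the data, and a CRLF, and a zero-size chunk ends the body. The function names `encode_chunked` and `decode_chunked` are hypothetical, and the sketch makes simplifying assumptions: it skips chunk extensions and does not handle trailers.

```python
def encode_chunked(pieces):
    """Frame an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue  # a zero-length chunk would end the body early
        out += f"{len(piece):X}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, with no trailers (simplification)
    return bytes(out)


def decode_chunked(data):
    """Recover the original body from a chunked message body."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Anything after ";" is a chunk extension, which this sketch ignores.
        size = int(data[pos:line_end].split(b";", 1)[0], 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


if __name__ == "__main__":
    wire = encode_chunked([b"Hello, ", b"world"])
    print(wire)
    print(decode_chunked(wire))
```

The sender can emit each chunk as soon as it is produced, which is why the total length does not need to be known before the first byte is sent.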
How it works
In the WWW, "client" and "server" are relative concepts that exist only for the duration of a specific connection; the client in one connection may act as a server in another. The information exchange based on the HTTP client/server model is divided into four stages: establishing a connection, sending the request, sending the response, and closing the connection.
HTTP is based on the request/response paradigm. After a client establishes a connection with the server, it sends a request consisting of a request method, a uniform resource identifier (URI), and a protocol version number, followed by a MIME-like message containing request modifiers, client information, and possibly body content. After receiving the request, the server responds with a status line, which includes the protocol version of the message and a success or error code, followed by a MIME-like message containing server information, entity information, and possibly body content.
Put simply, besides its HTML files, every Web server also runs a resident HTTP program (a daemon) that responds to user requests. Your browser is an HTTP client. When you enter a starting file in the browser or click a hyperlink, the browser sends an HTTP request to the server at the IP address identified by the URL. The resident program receives the request, performs the necessary operations, and returns the requested file.
In this process, the data sent and received over the network is divided into one or more packets. Each packet contains the data to be transmitted plus control information that tells the network how to handle the packet. TCP/IP determines the format of each packet. Unless told in advance, you might never notice that the information is split into many small pieces for transmission and then reassembled.
Many HTTP communications are initiated by a user agent and include a request for resources on the origin server. The simplest case may be a separate connection between the user agent (UA) and the origin server (O).
When one or more intermediaries appear in the request/response chain, the situation becomes more complicated. There are three types of intermediaries: proxy, gateway and tunnel. A proxy is a forwarding agent: it receives requests for a URI in its absolute form, rewrites all or part of the message, and forwards the reformatted request toward the server identified by the URI. A gateway is a receiving agent that acts as a layer above some other servers and, if necessary, translates requests into the underlying server's protocol. A tunnel acts as a relay point between two connections without changing the messages. Tunnels are used when the communication needs to pass through an intermediary (such as a firewall), even when the intermediary cannot understand the content of the messages.
Message format
HTTP messages consist of requests from the client to the server and responses from the server to the client. The format of the request message is as follows:
Request line-general information header-request header-entity header-message body
The request line starts with the method field, followed by the URL field and the HTTP protocol version field, and ends with CRLF. The fields are separated by SP (space) characters. No CR or LF is allowed except in the final CRLF sequence. For the general headers, request headers and entity headers, please refer to the relevant documents.
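To make the layout concrete, here is a minimal sketch that builds a request message and parses its request line, assuming the form `Method SP Request-URI SP HTTP-Version CRLF`. The helper names `build_request` and `parse_request_line` and the host `example.com` are hypothetical, and the parser is deliberately strict rather than a complete implementation.

```python
def build_request(method, target, headers=None, body=b"", version="HTTP/1.1"):
    """Assemble request line, header lines, an empty line, and the body."""
    lines = [f"{method} {target} {version}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_request_line(raw):
    """Split a request line into (method, target, version)."""
    if not raw.endswith(b"\r\n"):
        raise ValueError("request line must end with CRLF")
    line = raw[:-2]
    if b"\r" in line or b"\n" in line:
        raise ValueError("CR and LF are only allowed in the final CRLF")
    parts = line.split(b" ")
    if len(parts) != 3:
        raise ValueError("expected: Method SP Request-URI SP HTTP-Version")
    method, target, version = (part.decode("ascii") for part in parts)
    if not version.startswith("HTTP/"):
        raise ValueError("version must start with HTTP/")
    return method, target, version


if __name__ == "__main__":
    message = build_request("GET", "/index.html", {"Host": "example.com"})
    first_line = message.split(b"\r\n", 1)[0] + b"\r\n"
    print(parse_request_line(first_line))
```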
The response message format is as follows:
Status line-general information header-response header-entity header-message body
The status code is a 3-digit integer indicating the result of the attempt to understand and satisfy the request. The reason phrase is a brief textual description of the status code. The status code is intended for automated processing, and the reason phrase is intended for human users; the client is not required to examine or display the reason phrase. For the general headers, response headers, and entity headers, please refer to the relevant documents.
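A status line can be taken apart along the same lines. The sketch below splits it into version, status code, and reason phrase, and derives the class of the response from the first digit of the code. The names `parse_status_line`, `standard_reason` and `STATUS_CLASSES` are hypothetical. `http.HTTPStatus` from Python's standard library is used only to look up the conventional reason phrase for a code.

```python
from http import HTTPStatus

# The first digit of the status code defines the class of the response.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line):
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase' into parts."""
    version, code_text, *rest = line.rstrip("\r\n").split(" ", 2)
    if not (len(code_text) == 3 and code_text.isascii() and code_text.isdigit()):
        raise ValueError("status code must be exactly 3 digits")
    code = int(code_text)
    reason = rest[0] if rest else ""  # for humans; programs rely on the code
    return version, code, reason, STATUS_CLASSES.get(code // 100, "Unknown")


def standard_reason(code):
    """Return the conventional reason phrase, or None if Python has none."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        return None


if __name__ == "__main__":
    print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
    print(standard_reason(206))
```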
Status message
1xx: Informational
Message | Description |
---|---|
100 Continue | The server has received only part of the request, but as long as the server has not rejected it, the client should continue sending the rest of the request. |
101 Switching Protocols | The server will comply with the client's request and switch to another protocol. |
2xx: Success
Message | Description |
---|---|
200 OK | The request succeeded (the response document for the request follows). |
201 Created | The request was fulfilled and a new resource was created. |
202 Accepted | The request has been accepted for processing, but the processing has not been completed. |
203 Non-Authoritative Information | The document was returned normally, but some response headers may be incorrect because a copy of the document was used. |
204 No Content | There is no new document. The browser should continue to display the original document. This status code is useful if the user refreshes the page periodically and the servlet can determine that the user's document is sufficiently up to date. |
205 Reset Content | There is no new document, but the browser should reset what it displays. Used to force the browser to clear form input. |
206 Partial Content | The client sent a GET request with a Range header, and the server fulfilled it. |
3xx: Redirection
Message | Description |
---|---|
300 Multiple Choices | Multiple choices. A list of links; the user can select a link to reach the destination. Up to five addresses are allowed. |
301 Moved Permanently | The requested page has moved to a new URL. |
302 Found | The requested page has temporarily moved to a new URL. |
303 See Other | The requested page can be found under another URL. |
304 Not Modified | The document has not been modified. The client has a cached document and sent a conditional request (generally by providing an If-Modified-Since header to indicate that it only wants documents newer than the specified date). The server tells the client that the cached document can still be used. |
305 Use Proxy | The document requested by the client should be retrieved through the proxy server specified in the Location header. |
306 Unused | This code was used in an earlier version. It is no longer used, but the code remains reserved. |
307 Temporary Redirect | The requested page has temporarily moved to a new URL. |
4xx: Client Error
Message | Description |
---|---|
400 Bad Request | The server could not understand the request. |
401 Unauthorized | The requested page requires a username and password. |
401.1 | Login failed. |
401.2 | Login failed due to server configuration. |
401.3 | Unauthorized due to an ACL restriction on the resource. |
401.4 | Authorization failed by a filter. |
401.5 | Authorization failed by an ISAPI/CGI application. |
401.7 | Access denied by the URL authorization policy on the Web server. This error code is specific to IIS 6.0. |
402 Payment Required | This code is reserved for future use. |
403 Forbidden | Access to the requested page is forbidden. |
403.1 | Execute access is forbidden. |
403.2 | Read access is forbidden. |
403.3 | Write access is forbidden. |
403.4 | SSL is required. |
403.5 | SSL 128 is required. |
403.6 | The IP address was rejected. |
403.7 | A client certificate is required. |
403.8 | Access to the site was denied. |
403.9 | Too many users. |
403.10 | The configuration is invalid. |
403.11 | Password change. |
403.12 | Access to the mapping table is denied. |
403.13 | The client certificate was revoked. |
403.14 | Directory listing is denied. |
403.15 | Client access licenses exceeded. |
403.16 | The client certificate is untrusted or invalid. |
403.17 | The client certificate has expired or is not yet valid. |
403.18 | The requested URL cannot be executed in the current application pool. This error code is specific to IIS 6.0. |
403.19 | CGI cannot be executed for clients in this application pool. This error code is specific to IIS 6.0. |
403.20 | Passport login failed. This error code is specific to IIS 6.0. |
404 Not Found | The server cannot find the requested page. |
404.0 | (None) File or directory not found. |
404.1 | The website cannot be accessed on the requested port. |
404.2 | The Web service extension lockdown policy prevents this request. |
404.3 | The MIME mapping policy prevents this request. |
405 Method Not Allowed | The method specified in the request is not allowed. |
406 Not Acceptable | The response generated by the server cannot be accepted by the client. |
407 Proxy Authentication Required | The client must first authenticate with the proxy server before the request can be processed. |
408 Request Timeout | The request took longer than the server was prepared to wait. |
409 Conflict | The request could not be completed due to a conflict. |
410 Gone | The requested page is no longer available. |
411 Length Required | "Content-Length" is not defined. The server will not accept the request without it. |
412 Precondition Failed | The precondition given in the request was evaluated as false by the server. |
413 Request Entity Too Large | The server will not accept the request because the request entity is too large. |
414 Request-URI Too Long | The server will not accept the request because the URL is too long. This happens when a POST request is converted to a GET request with long query information. |
415 Unsupported Media Type | The server will not accept the request because the media type is not supported. |
416 Requested Range Not Satisfiable | The server cannot satisfy the Range header specified by the client in the request. |
417 Expectation Failed | The server cannot meet the expectation given in the Expect request header. |
423 Locked | The resource is locked. |
5xx: Server Error
Message | Description |
---|---|
500 Internal Server Error | The request was not completed. The server encountered an unexpected condition. |
500.12 | The application is busy restarting on the Web server. |
500.13 | The Web server is too busy. |
500.15 | Direct requests for Global.asa are not allowed. |
500.16 | The UNC authorization credentials are incorrect. This error code is specific to IIS 6.0. |
500.18 | The URL authorization store cannot be opened. This error code is specific to IIS 6.0. |
500.100 | Internal ASP error. |
501 Not Implemented | The request was not completed. The server does not support the requested functionality. |
502 Bad Gateway | The request was not completed. The server received an invalid response from the upstream server. |
502.1 | The CGI application timed out. |
502.2 | An error occurred in the CGI application. |
503 Service Unavailable | The request was not completed. The server is temporarily overloaded or down. |
504 Gateway Timeout | The gateway timed out. |
505 HTTP Version Not Supported | The server does not support the HTTP version specified in the request. |
Version number
HTTP request and response messages include an HTTP version number. There has been some confusion about the correct use and interpretation of HTTP version numbers and about interoperability between HTTP implementations of different protocol versions. For the use and interpretation of HTTP version numbers, please refer to RFC 2145.
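One point that is easy to get wrong is that the major and minor parts of the version are separate integers, not a decimal fraction, so HTTP/2.4 is a lower version than HTTP/2.13. A minimal sketch of that comparison follows; the helper name `parse_http_version` and the strict pattern it accepts are assumptions made for illustration.

```python
import re

# "HTTP/" followed by a major and a minor number, separated by a dot.
_VERSION_PATTERN = re.compile(r"HTTP/(\d+)\.(\d+)")


def parse_http_version(text):
    """Return (major, minor) as integers so versions compare correctly."""
    match = _VERSION_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not an HTTP version: {text!r}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    # Tuples compare element by element: major first, then minor.
    print(parse_http_version("HTTP/2.4") < parse_http_version("HTTP/2.13"))
    print(parse_http_version("HTTP/1.1"))
```

Comparing the two parts as a tuple avoids the mistake of treating "2.13" as smaller than "2.4", which is what a floating-point or string comparison would conclude.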