I have recently been preparing for front-end interviews, and the question of what happens between entering a URL and seeing the page is asked in almost every one. I gathered material online and organized it into the notes below.
The overall process is as follows:
- Enter URL
- DNS resolution
- Establish TCP connection
- Send HTTP request
- Server permanent redirection
- Server processes the request and returns an HTTP response
- Browser displays HTML
- Connection ends
Enter URL
URL stands for Uniform Resource Locator. It identifies where a resource is located and how to access it.
Its general form is: protocol://hostname[:port]/path;parameters?query#fragment
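The components above can be pulled apart with Python's standard `urllib.parse` module. This is only a small sketch; the URL is a made-up example chosen so that every component is present.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical URL, chosen only so that every component is non-empty.
url = "https://www.example.com:8080/docs/index.html;v=1?lang=en&page=2#intro"
parts = urlparse(url)

print(parts.scheme)    # protocol:   https
print(parts.hostname)  # hostname:   www.example.com
print(parts.port)      # port:       8080
print(parts.path)      # path:       /docs/index.html
print(parts.params)    # parameters: v=1
print(parts.query)     # query:      lang=en&page=2
print(parts.fragment)  # fragment:   intro

# The query string can be decoded further into key/value pairs.
query = parse_qs(parts.query)
print(query)           # {'lang': ['en'], 'page': ['2']}
```

Only the hostname part is handed to DNS in the next step; the path, query, and fragment are used later.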
DNS resolution
DNS (Domain Name System) is a distributed database on the Internet that maps domain names to IP addresses. The process of obtaining the IP address corresponding to the hostname is called domain name resolution.
In effect, DNS resolution finds out which machine holds the requested resource. It acts as a translator, converting the hostname in the entered URL into an IP address.
Here is a lookup order for DNS:
- Browser cache: The browser checks its own cache of recently resolved domain names
- Operating system cache: The operating system checks the DNS cache it keeps in memory
- Hosts file: The system checks the hosts file on the local hard disk
- Router cache: Some routers cache the domain names they have resolved
- ISP (Internet Service Provider) DNS cache: If all local lookups fail, the query goes to the local DNS server, usually run by the ISP, which first checks its own cache
- Root DNS server: If the local DNS server has no cached record, it asks a root server, which determines which top-level domain server manages the name and returns that server's IP address to the requester.
The local DNS server then queries the top-level domain server and, from there, the authoritative name server for the domain. Once it has the answer, the local DNS server returns the IP address to the computer, and the mapping is saved in the cache.
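The lookup order above is essentially a chain of caches tried one after another. The sketch below models that idea with plain dictionaries; the table contents, the 192.0.2.10 address, and the `resolve` helper are all invented for illustration and do not reflect how a real resolver is implemented.

```python
# Made-up stand-ins for the real caches, listed in lookup order.
browser_cache = {}
os_cache = {}
hosts_file = {"localhost": "127.0.0.1"}
router_cache = {}
local_dns_cache = {"www.example.com": "192.0.2.10"}  # documentation-only address

LAYERS = [
    ("browser cache", browser_cache),
    ("operating system cache", os_cache),
    ("hosts file", hosts_file),
    ("router cache", router_cache),
    ("local DNS server cache", local_dns_cache),
]

def resolve(hostname):
    """Return (ip, layer that answered), trying each layer in order."""
    for name, table in LAYERS:
        ip = table.get(hostname)
        if ip is not None:
            browser_cache[hostname] = ip  # remember the mapping for next time
            return ip, name
    return None, "not cached: ask the root, TLD and authoritative servers"

first = resolve("www.example.com")
second = resolve("www.example.com")
print(first)   # answered by the local DNS server cache
print(second)  # answered by the browser cache, because the first lookup cached it
```

The second call never reaches the later layers, which is why caching is listed as a DNS optimization.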
Expansion:
DNS query methods:
- Recursive: The client asks the local DNS server once. The local DNS server queries other DNS servers on the client's behalf (generally the root server first, then down level by level) and returns the final result to the client.
- Iterative: When the local DNS server cannot answer the query itself, it returns the IP addresses of other DNS servers that can resolve the domain name, and the client DNS program then queries those servers itself.
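The difference between the two methods is who follows the referrals. The toy model below walks a referral chain from the root down; in a recursive query the local DNS server would run this loop for the client, while in an iterative query the requester runs it itself. The server names, the zone data, and both functions are hypothetical.

```python
# Toy name-server hierarchy. Each server maps a domain suffix either to a
# referral ("NS", next server to ask) or to a final answer ("A", IP address).
SERVERS = {
    "root": {"com": ("NS", "tld-com")},
    "tld-com": {"example.com": ("NS", "ns.example.com")},
    "ns.example.com": {"www.example.com": ("A", "192.0.2.10")},
}

def query(server, hostname):
    """Ask one server one question; it returns an answer or a referral."""
    for suffix, record in SERVERS[server].items():
        if hostname == suffix or hostname.endswith("." + suffix):
            return record
    return ("NXDOMAIN", None)

def follow_referrals(hostname):
    """Start at the root and follow each referral until an answer arrives."""
    server, asked = "root", []
    while True:
        asked.append(server)
        kind, value = query(server, hostname)
        if kind == "A":
            return value, asked
        if kind != "NS":
            return None, asked
        server = value  # the referral names the next server to ask

ip, asked = follow_referrals("www.example.com")
print(ip)     # 192.0.2.10
print(asked)  # ['root', 'tld-com', 'ns.example.com']
```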
DNS optimization methods:
- DNS caching
- DNS load balancing
  - Why it is needed: If all requests for a resource go to the same machine, that machine may be overloaded and crash.
  - Principle: Configure multiple IP addresses for one hostname. The DNS server rotates through the recorded IP addresses and returns them in a different order for each query, which steers visitors to different machines.
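A minimal sketch of the round-robin idea behind DNS load balancing, assuming three made-up addresses for one hostname. The `RoundRobinRecord` class is hypothetical; real DNS servers implement rotation in their own ways.

```python
from collections import deque

class RoundRobinRecord:
    """Several IP addresses for one hostname, rotated on every query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        result = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return result

record = RoundRobinRecord(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
answers = [record.answer() for _ in range(4)]

# Clients usually connect to the first address in the answer,
# so successive visitors are steered to different machines.
first_choices = [answer[0] for answer in answers]
print(first_choices)  # ['192.0.2.1', '192.0.2.2', '192.0.2.3', '192.0.2.1']
```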
Establish TCP connection
After obtaining the IP address, it is time to establish a TCP connection through a three-way handshake.
- First handshake: The client sends a SYN (synchronize sequence number) packet to the server and enters the SYN_SENT state, waiting for the server to confirm.
- Second handshake: After the server receives the SYN packet, it acknowledges it and sends its own SYN in the same packet (a SYN+ACK packet), and the server enters the SYN_RCVD state.
- Third handshake: The client receives the SYN+ACK packet and sends an ACK packet back to the server. The client enters the ESTABLISHED state after sending it, and the server enters the ESTABLISHED state after receiving it.
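The three steps can be traced as a small state simulation. This is a sketch of the bookkeeping only, with no real sockets; the initial sequence numbers 100 and 300 are arbitrary, and the `three_way_handshake` function is an illustrative helper.

```python
def three_way_handshake(client_isn, server_isn):
    """Return each segment with the (client, server) states after it."""
    client, server = "CLOSED", "LISTEN"
    trace = []

    # 1. The client sends SYN carrying its initial sequence number (ISN).
    syn = {"flags": "SYN", "seq": client_isn}
    client = "SYN_SENT"
    trace.append((syn, client, server))

    # 2. The server acknowledges the client's ISN and sends its own SYN.
    syn_ack = {"flags": "SYN+ACK", "seq": server_isn, "ack": syn["seq"] + 1}
    server = "SYN_RCVD"
    trace.append((syn_ack, client, server))

    # 3. The client acknowledges the server's ISN.
    ack = {"flags": "ACK", "seq": syn_ack["ack"], "ack": syn_ack["seq"] + 1}
    client = "ESTABLISHED"  # after sending the ACK
    server = "ESTABLISHED"  # after receiving the ACK
    trace.append((ack, client, server))
    return trace

trace = three_way_handshake(client_isn=100, server_isn=300)
for segment, client, server in trace:
    print(segment, client, server)
```

Each acknowledgment number is the peer's sequence number plus one, which is how both sides confirm that the other received their initial sequence number.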
Expansion:
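Why a three-way handshake: It prevents a stale connection request, one that was delayed in the network and is no longer valid, from suddenly reaching the server and causing errors. With only two steps, the server would treat such a request as a new connection and waste resources waiting for data that never arrives.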
Send HTTP request
After the TCP connection is established, the client sends an HTTP request. An HTTP request message consists of three parts:
- Request line: Request method + URL + protocol/version
- Request header: Transmits additional information about the request and client itself
- Request body: Data to be transmitted
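To make the three parts concrete, the sketch below assembles a raw HTTP/1.1 request by hand. The `build_request` helper, the `/login` path, and the form data are invented for the example; in practice an HTTP client library produces this message.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Assemble request line + headers + blank line + body as bytes."""
    lines = [f"{method} {path} HTTP/1.1"]  # request line
    all_headers = {"Host": host}           # request headers
    all_headers.update(headers or {})
    if body:
        all_headers["Content-Length"] = str(len(body))
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"  # a blank line ends the headers
    return head.encode("ascii") + body      # request body

raw = build_request(
    "POST",
    "/login",
    "www.example.com",
    {"Content-Type": "application/x-www-form-urlencoded"},
    b"user=alice",
)
print(raw.decode("ascii"))
```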
Server permanent redirection
The server may respond to the browser with a 301 (Moved Permanently) redirect. For example, accessing http://google.com/ automatically redirects to http://www.google.com/
Purpose:
- Search engines count visits to the addresses with and without www toward the same site, so the website's ranking in search results is not lowered.
- Serving the same page under different addresses hurts cacheability. When a page has multiple names, it may be stored in the cache multiple times.
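From the browser's side, handling a 301 means reading the Location header and requesting the new address. The sketch below imitates this with a table of canned responses instead of real network calls; the `RESPONSES` table and the `follow_redirects` function are hypothetical.

```python
# Canned responses standing in for a real server: URL -> (status, headers).
RESPONSES = {
    "http://google.com/": (301, {"Location": "http://www.google.com/"}),
    "http://www.google.com/": (200, {}),
}

def follow_redirects(url, max_hops=5):
    """Follow 301 responses until a non-redirect status is returned."""
    chain = [url]
    for _ in range(max_hops):
        status, headers = RESPONSES[url]
        if status != 301:
            return status, chain
        url = headers["Location"]  # the server names the permanent address
        chain.append(url)
    raise RuntimeError("too many redirects")

status, chain = follow_redirects("http://google.com/")
print(status)  # 200
print(chain)   # ['http://google.com/', 'http://www.google.com/']
```

The hop limit is a common safeguard against redirect loops and is an addition for the example.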
Server processes the request and returns an HTTP response
The backend listens on a fixed port. After receiving the TCP segments on that port, it reassembles the data, parses it according to the HTTP protocol, and wraps the result in an HTTP Request object for the upper layers to use.
An HTTP response consists of four parts:
- Status line: Protocol version, status code, status description
- Response header: Consists of key-value pairs, one pair per line, separated by ":"
- Blank line: Separates the response headers from the response body
- Response body
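The four parts can be seen by splitting a raw response by hand. This is a simplified sketch: the `parse_response` function is illustrative, the sample response is made up, and real parsers also handle details such as repeated headers and chunked bodies.

```python
def parse_response(raw):
    """Split a raw HTTP response into status line, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the blank line
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body

raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 14\r\n"
    b"\r\n"
    b"<h1>Hello</h1>"
)
version, code, reason, headers, body = parse_response(raw_response)
print(version, code, reason)  # HTTP/1.1 200 OK
print(headers)
print(body)                   # b'<h1>Hello</h1>'
```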
Expansion:
In larger websites, requests are sent to reverse proxies, and the same application is deployed on multiple servers to distribute a large number of user requests to multiple machines.
That is, the client first requests Nginx, Nginx requests the application server, and finally returns the result to the client.
Browser displays HTML
The browser does not wait for the whole document before displaying it; it renders while it parses. The general process is as follows:
- Parse the HTML file to build the DOM tree
- Parse the CSS to build the CSSOM tree, then combine it with the DOM tree to build the render tree
- Lay out the render tree (calculate the position and size of each node) and paint it on the screen
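As a rough illustration of how style information and the DOM tree combine into a render tree, the toy model below drops every node styled with display: none. The tuple-based tree, the tag-only stylesheet, and `build_render_tree` are deliberate simplifications; real browsers match full CSS selectors and compute many more properties.

```python
# Toy DOM: each node is (tag, children).
dom = ("html", [
    ("head", [("title", [])]),
    ("body", [("h1", []), ("p", []), ("script", [])]),
])

# Toy stylesheet: tag name -> declarations.
styles = {
    "head": {"display": "none"},
    "script": {"display": "none"},
    "h1": {"font-size": "32px"},
}

def build_render_tree(node):
    """Attach styles to each node and skip nodes that are not displayed."""
    tag, children = node
    style = styles.get(tag, {})
    if style.get("display") == "none":
        return None  # the node and its whole subtree are left out
    rendered = [build_render_tree(child) for child in children]
    return (tag, style, [r for r in rendered if r is not None])

render_tree = build_render_tree(dom)
print(render_tree)
```

Only html, body, h1, and p survive here, which is why the render tree is usually smaller than the DOM tree.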
Expansion:
About reflow and repaint:
- Every element in the DOM is laid out as a box (the box model). The process in which the browser calculates each box's position, size, and other geometric attributes is called reflow.
- Once these attributes are determined, the browser draws the content. This drawing process is called repaint.
Reflow and repaint always happen during the initial loading of the page. Both are expensive, so after that they should be triggered as little as possible.
JS parsing and execution mechanism:
When the parser encounters a script, the browser pauses HTML parsing and rendering until the JS file has been downloaded and executed (because JS may modify the DOM, for example with document.write). This is why JS code is usually placed at the end of the HTML.
JS is parsed and executed by the browser's JS engine. JS is single-threaded, so time-consuming work such as I/O would block every task queued behind it. To avoid this, tasks are divided into synchronous tasks and asynchronous tasks.
The execution mechanism of JS can be regarded as a main thread + a task queue.
Synchronous tasks run on the main thread, where they form an execution stack;
Asynchronous tasks do not block the main thread. When one of them has a result, an event (its callback) is placed in the task queue.
The engine first runs everything on the execution stack. When the stack is empty, it takes events from the task queue and runs the corresponding callbacks.
This process repeats in a loop, which is known as the event loop.
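The main thread plus task queue model can be imitated in a few lines. The sketch below is a simulation in Python, not browser code: `set_timeout` is a stand-in for an asynchronous API whose result is already available, and all names are invented for the example.

```python
from collections import deque

task_queue = deque()  # callbacks whose asynchronous work has finished
output = []

def set_timeout(callback):
    """Stand-in for an async API: the callback is queued, not run at once."""
    task_queue.append(callback)

def main_script():
    output.append("script start")
    set_timeout(lambda: output.append("timeout callback"))
    output.append("script end")

def run_event_loop():
    main_script()        # 1. run the synchronous code to completion
    while task_queue:    # 2. then take tasks off the queue one at a time
        task = task_queue.popleft()
        task()           # a task may queue further tasks

run_event_loop()
print(output)  # ['script start', 'script end', 'timeout callback']
```

The callback runs only after the synchronous code has finished, even though it was registered in the middle of the script.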
Connection ends
Nowadays, to reduce request latency, TCP connections are generally kept alive and reused for multiple requests. The connection is typically closed later, for example when the current page is closed.
The TCP connection is then closed with a four-way handshake (here the client closes first):
- The client sends a FIN and enters the FIN_WAIT_1 state.
- The server receives the FIN and sends an ACK to the client, with the acknowledgment number set to the received sequence number + 1. The server enters the CLOSE_WAIT state, and the client enters the FIN_WAIT_2 state when the ACK arrives.
- The server sends its own FIN packet to close its side of the data transmission and enters the LAST_ACK state.
- The client receives the FIN, sends an ACK to the server, and enters the TIME_WAIT state. The server enters the CLOSED state after receiving the ACK, and the client enters the CLOSED state after waiting 2 MSL (twice the maximum segment lifetime).
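The state changes during teardown can be traced the same way as for connection setup. In this sketch the host that closes first is called the client; the `four_way_close` function is an illustrative helper, and the FIN_WAIT_2 step and the final 2 MSL timer are standard TCP details added here to complete the trace.

```python
def four_way_close():
    """Return (event, client state, server state) for each teardown step."""
    client, server = "ESTABLISHED", "ESTABLISHED"
    trace = []

    # 1. The client has no more data to send.
    client = "FIN_WAIT_1"
    trace.append(("client -> server: FIN", client, server))

    # 2. The server acknowledges; it may still send remaining data.
    server = "CLOSE_WAIT"
    client = "FIN_WAIT_2"  # once the ACK arrives
    trace.append(("server -> client: ACK", client, server))

    # 3. The server has finished sending as well.
    server = "LAST_ACK"
    trace.append(("server -> client: FIN", client, server))

    # 4. The client acknowledges the server's FIN.
    client = "TIME_WAIT"
    server = "CLOSED"      # once the ACK arrives
    trace.append(("client -> server: ACK", client, server))

    # The client waits 2 MSL in case its last ACK was lost, then closes.
    client = "CLOSED"
    trace.append(("2 MSL timer expires", client, server))
    return trace

close_trace = four_way_close()
for event, client, server in close_trace:
    print(f"{event:24} client={client:11} server={server}")
```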