There and Back Again: An HTTPS Request’s Tale, part 4
19 Feb 2019
In the previous post we learned how the browser and server establish a connection through the TCP 3-way handshake. Now we move on to the final step of our journey, where the browser makes an HTTP request to the server to get the HTML content of the website. We will go over the basic structure of an HTTP request and response and how information is organized in them. We will use our previous example of https://www.google.com.
The browser sends an HTTP request to the server. An example request is illustrated below.
GET / HTTP/1.1
Host: google.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6)
Cookie: example-cookie=chocolatechip
<---- "blank line"
<---- "empty body"
It consists of four parts:
- A start line, which contains the HTTP method, the URL path of the page we want to request, and the version of HTTP that the browser is using. In our example shown above the HTTP method is GET, the path is “/”, which means the root page, and the HTTP version is HTTP/1.1.
- Headers, which can carry a variety of information, such as the host of the website (google.com), the user agent, which identifies the browser and operating system (Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6)), and any cookies that the browser wants to send to the server (example-cookie=chocolatechip). Note that in this example our headers have only three lines of information, but an actual request may have many more.
- A blank line indicating the end of the headers.
- An optional body located directly below the blank line. In our example the body is empty, since a GET request only asks the server for information and does not send any data of its own.
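To make the four parts concrete, here is a minimal Python sketch that assembles the example request as raw bytes. The helper name `build_request` is hypothetical and exists only for illustration; it is not how a real browser builds requests. One detail the illustration above glosses over: on the wire, HTTP/1.1 ends each line with a carriage return plus line feed (`\r\n`), so the "blank line" is simply two of those in a row.

```python
def build_request(method, target, headers, body=b""):
    """Assemble a raw HTTP/1.1 request from its four parts (illustrative helper)."""
    # 1. Start line: HTTP method, request path, and HTTP version.
    start_line = f"{method} {target} HTTP/1.1"
    # 2. Headers: one "Name: value" line per header.
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # 3. Blank line: the last header's line ending followed by an empty line.
    head = "\r\n".join([start_line, *header_lines]) + "\r\n\r\n"
    # 4. Optional body: empty for our GET request.
    return head.encode("ascii") + body


request = build_request(
    "GET",
    "/",
    {
        "Host": "google.com",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6)",
        "Cookie": "example-cookie=chocolatechip",
    },
)
print(request.decode("ascii"))
```

Printing the result shows the same start line and three headers as the example above, followed by the blank line and nothing else, because the body is empty.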
When the server receives the HTTP request, it sends an HTTP response back to the browser. An HTTP response is illustrated below.
HTTP/1.1 200 OK
Set-Cookie: example-cookie=chocolatechip
Content-Type: text/html; charset=utf-8
<---- "blank line"
<html>
<body>
Google website
</body>
</html>
It also consists of four parts:
- A status line, which contains the HTTP version, the status code, and the status text. The status code is a numerical code representing the outcome of the request. For example, a status code of 200 means the request succeeded, while a code of 500 means something went wrong on the server.
- Headers, such as Set-Cookie, which instructs the browser to set a cookie to a certain value, and Content-Type, which specifies the media type of the response body. Note that an actual response may have many more headers.
- A blank line.
- A response body. Our response body contains the HTML of Google’s homepage (very simplified for clarity in our example).
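Reading a response is the same process in reverse. The sketch below splits a raw response into the four parts listed above. The function name `parse_response` is hypothetical, the sample bytes mirror the example response with the standard status text OK, and the parser is deliberately simplified: it assumes the status line includes a status text, and it stores headers in a dictionary, so a repeated header such as a second Set-Cookie would overwrite the first. Real clients handle those cases.

```python
from http import HTTPStatus


def parse_response(raw):
    """Split a raw HTTP/1.1 response into its parts (simplified, illustrative)."""
    # The first blank line (two line endings in a row) separates head from body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: HTTP version, status code, status text (which may contain spaces).
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Set-Cookie: example-cookie=chocolatechip\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"\r\n"
    b"<html>\n<body>\nGoogle website\n</body>\n</html>\n"
)

version, code, reason, headers, body = parse_response(raw)
print(version, code, reason)
print(headers["content-type"])
# The standard library knows the usual status text for each status code.
print(HTTPStatus(code).phrase)
```

The status code is the part programs act on; the status text is only a human-readable label, which is why the standard library can look it up from the code alone.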
Now, finally, the browser has the HTML it needs to display www.google.com! This ends our journey through the series: DNS, the HTTPS (SSL/TLS) handshake, the TCP 3-way handshake, and finally the HTTP request/response. I hope you enjoyed the journey!