You’ve probably heard this popular question. Or maybe you were simply curious about what happens when you type “google.com.” But before exploring the question, let’s choose my favorite domain: “holbertonschool.com.”
The answer to this question involves details of how your computer interacts with another computer connected via the Internet, covering the following sections:
- Keyboard input
- Event handling
- Client and Server
- DNS Lookup
- OSI model
- Web server
- Load balancing
When you type holbertonschool.com in the browser, you type it on your computer’s keyboard. Each keypress emits an event: it signals the operating system (OS) that a state has changed, and the OS records the change and responds to it. It’s like the moment you touch a hot pan: your brain processes the signal and prepares your body to respond (move your hand away).
There are flags and keycodes to detect when a key is pressed, identify which key it was and generate a response accordingly. Different keys on your keyboard evoke different responses. When a key is pressed, the keyboard hardware raises an interrupt to tell the OS kernel that it needs immediate attention. The CPU (central processing unit, a.k.a. the brain of the computer) responds by suspending its current activity, saving its state and executing the interrupt handler function. If you had typed “holbertonschool.com” in a text editor, your OS would have let the text editor program handle the input. But since you typed it in the browser, the OS lets the browser application handle it.
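To make that hand-off concrete, here is a toy model in Python of the path a keypress takes: the interrupt queues a keycode, the OS translates it into a character and passes it to whichever application has focus. Everything in it (the `KEYMAP` table and its keycode numbers, `ToyOS`, `TextField`) is made up for illustration; real kernels and keyboard drivers are far more involved.

```python
from collections import deque

# Hypothetical keycodes; real values depend on the keyboard and driver.
KEYMAP = {1: "h", 2: "o", 3: "l"}


class TextField:
    """Stands in for an application that receives typed characters."""

    def __init__(self, name):
        self.name = name
        self.text = ""

    def on_key(self, char):
        self.text += char


class ToyOS:
    """Queues keycodes on 'interrupt' and delivers them to the focused app."""

    def __init__(self):
        self.queue = deque()
        self.focused = None

    def interrupt(self, keycode):
        # The interrupt handler does as little as possible: record the event.
        self.queue.append(keycode)

    def dispatch(self):
        # Later, translate each keycode and hand it to the focused application.
        while self.queue:
            char = KEYMAP.get(self.queue.popleft())
            if char is not None and self.focused is not None:
                self.focused.on_key(char)
```

If the `focused` attribute points at the browser's address bar, the characters end up there; point it at a text editor instead and the same keypresses land in the editor.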
What is a browser?
Let’s say you want to order food, and you choose to use DoorDash. While your food is getting ready, the DoorDash driver goes to the restaurant—then grabs your food and delivers it to you. Voila! In minutes, you have your food.
But who is actually serving you—the restaurant or DoorDash? In this case, DoorDash is the middleman, letting the restaurant serve you.
Similarly, the browser is the medium that allows you to make a request and lets a server serve you. It’s software installed and running on your computer that lets you browse the web. It takes your input, creates and sends a request, receives the response and displays it to you.
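As a rough sketch of the “takes your input” step, the snippet below turns what you typed into the pieces a browser needs before it can build a request. The function name `normalize_typed_address` and the choice to default to HTTPS are my assumptions; real browsers apply many more rules (search detection, redirects, security policies).

```python
from urllib.parse import urlsplit


def normalize_typed_address(text, default_scheme="https"):
    """Split a typed address into scheme, host, port and path."""
    text = text.strip()
    # Without a scheme, urlsplit would treat the whole string as a path.
    if "://" not in text:
        text = f"{default_scheme}://{text}"
    parts = urlsplit(text)
    default_port = 443 if parts.scheme == "https" else 80
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or default_port,
        "path": parts.path or "/",
    }


print(normalize_typed_address("holbertonschool.com"))
```

The host is the part the browser still has to turn into an address before it can contact anyone.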
But wait: how does a DoorDash driver know which restaurant to go to and how to find it? (Google Maps, of course.) Now, how does your browser know which server to send the request to? Yes, you guessed it right, it needs to find the server’s address. So, it queries the DNS (Domain Name System) to find the IP address.
The DNS is the Internet’s version of Google Maps: it translates a name you know into an address you can travel to. Your computer or your router knows the address of a DNS server. When you type the URL in a browser for the first time, a query goes to that DNS server, which responds with the IP address of the web server hosting, for example, holbertonschool.com. This value is then usually cached for a limited time, so your browser doesn’t have to repeat the lookup on every visit.
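Here is a minimal sketch of that lookup-then-cache behavior in Python. `socket.getaddrinfo` is the standard library call that asks the operating system's resolver; the `lookup` wrapper, the plain dictionary cache and the port number are my simplifications (real caches expire entries according to the DNS record's time-to-live).

```python
import socket

_cache = {}


def system_resolver(hostname):
    """Ask the OS resolver (which queries DNS) for the host's IP addresses."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    # Each entry ends with a socket address whose first field is the IP.
    return sorted({info[4][0] for info in infos})


def lookup(hostname, resolver=system_resolver, cache=None):
    """Return cached addresses if present; otherwise resolve and remember."""
    if cache is None:
        cache = _cache
    if hostname not in cache:
        cache[hostname] = resolver(hostname)
    return cache[hostname]


# lookup("holbertonschool.com")  # performs a real DNS query on first call
```

The `resolver` parameter is only there so the lookup can be swapped out, for example to try the caching logic without touching the network.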
Now that your browser knows the IP address of the server, it needs to find a way to pass this request all the way to the server. When you place the order, it’s not just you interacting with DoorDash. There’s another end being managed—with the need to check that the restaurant is ready to accept the order, handle billing and payments, find the most reliable driver, and so on. Similarly, there’s a lot of stuff that needs to be managed for smooth communication between browser and server.
There’s something called the OSI (Open Systems Interconnection) model that standardizes communication between different computing machines (ref. Wikipedia). It describes the flow of information from one computer to another. It defines seven layers, and the interplay of these layers magically brings, for example, holbertonschool.com from the server to your machine. Both ends (client and server) follow these layers, but in opposite order. When your browser sends the request, communication starts at the application layer and goes down to the physical layer, whereas on the server, the incoming request starts at the physical layer and goes up. Likewise, when the server responds to your browser’s request, the response goes from the application layer down to the physical layer, and when your computer receives it, it travels from the physical layer all the way back up to the application layer.
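The down-then-up flow can be pictured as wrapping and unwrapping. The sketch below is only a mental model: each layer adds a text label on the way down and removes it on the way up. The function names `send_down` and `receive_up` are mine, and real protocols add binary headers rather than labels (and not every layer adds one).

```python
LAYERS = [
    "application", "presentation", "session", "transport",
    "network", "data link", "physical",
]


def send_down(payload):
    """Sender side: wrap the payload layer by layer, top to bottom."""
    for layer in LAYERS:
        payload = f"[{layer}]{payload}"
    return payload


def receive_up(frame):
    """Receiver side: unwrap bottom to top, recording the order visited."""
    visited = []
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not frame.startswith(header):
            raise ValueError(f"expected {header} header")
        frame = frame[len(header):]
        visited.append(layer)
    return frame, visited


wire = send_down("GET /")
payload, visited = receive_up(wire)
print(wire)
print(payload, visited)
```

The outermost label on the wire is the physical layer's, which is why the receiver has to start there.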
7. Application layer: consists of the protocols that applications use directly on behalf of the end user. A protocol defines how different applications across machines communicate with each other. If you are requesting a web page, HTTP (Hypertext Transfer Protocol) will handle it, and if you are sending an email, SMTP (Simple Mail Transfer Protocol) will handle it. So in the case of holbertonschool.com, your browser generates an HTTP request. Don’t mistake the browser itself for part of the application layer. The application layer comes into play when your browser creates an HTTP request; that request is what belongs to the application layer.
6. Presentation layer: Depending on what you requested (image, video, text, GIF, etc.), this layer converts the data into a format the application can read, handling things like encoding and compression. In the case of holbertonschool.com, when your machine receives the response, the presentation layer kicks in so that it can be rendered as an HTML page.
5. Session layer: responsible for establishing, maintaining, and terminating the session between devices. For example, when you are on a video chat, the period from the time you enter the chat to the time you leave it is one complete session, provided there were no interruptions during that interval. However, in the case of holbertonschool.com, HTTP relies on a lower-layer protocol to manage the connection instead of session layer protocols.
4. Transport layer: takes care of reliable delivery between the two ends of the connection. Here, the transportation, delivery and reassembly of data take place. When you request holbertonschool.com, the request itself is small, so the role of this layer is more evident when you receive the response. The data your machine receives arrives divided into segments, each carrying a sequence number along with its data payload. This layer makes sure that you have received every segment and reassembles them in order. As I mentioned above, HTTP uses TCP (Transmission Control Protocol) instead of session layer protocols to establish and maintain a connection from your machine to the server and to ensure reliable delivery. For security, HTTPS adds TLS (Transport Layer Security, the successor to SSL, Secure Sockets Layer) on top of TCP. It encrypts all data passed between the browser and the web server, keeping communications private and protecting their integrity. In HTTP requests, it’s the job of TCP to ensure reliable, in-order delivery. In a similar way, DoorDash has to make sure that all the requests are served well and are distributed across drivers.
3. Network layer: This organizes and routes the data. In the case of holbertonschool.com, routers use the destination address carried by IP (Internet Protocol) to choose a path for the data between your machine and the web server.
2. Data link layer: In this layer, data is packaged into frames for transmission over a single network link. When the server sends holbertonschool.com, it doesn’t send the entire page all at once; the layers above have already split the data into packets, and the data link layer encapsulates each packet in a frame and transmits it through the physical layer. The packets aren’t necessarily delivered directly to your machine. They may travel from network to network, passing through many machines before reaching you. At each of these hops, the IP address of the next machine is translated to a hardware (MAC) address at the data link layer.
1. Physical layer: The physical layer deals with the actual connectivity between your machine and the server. The hardware, signaling and encoding mechanisms required to form the actual connection are defined at this layer, and the data received from the server is in the form of raw bits. Try “ifconfig” in your terminal (or “ip addr” on newer Linux systems) to check out the network interface configuration of your system.
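Two of the layers above are easy to try out in a few lines of Python: the application layer's HTTP request, and the transport layer's trick of numbering pieces so they can be put back in order. This is a sketch under simplifying assumptions. The helper names (`build_http_request`, `segment`, `reassemble`) and the 16-byte piece size are mine, and real TCP also handles acknowledgements, retransmission and much more.

```python
def build_http_request(host, path="/"):
    """Application layer: the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def segment(data, size):
    """Transport layer: cut data into (sequence number, payload) pieces."""
    return [(start, data[start:start + size])
            for start in range(0, len(data), size)]


def reassemble(segments):
    """Receiver: sort by sequence number and join the payloads."""
    ordered = sorted(segments, key=lambda seg: seg[0])
    return b"".join(payload for _, payload in ordered)


request = build_http_request("holbertonschool.com")
pieces = segment(request, 16)
# Pretend the network delivered the pieces in reverse order.
restored = reassemble(list(reversed(pieces)))
print(restored == request)
```

Here the sequence number is simply the byte offset of each piece, which is also how TCP counts: in bytes rather than in pieces.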
So far, I have mainly talked from the client perspective. It’s time to understand what happens at the server end.
Popular websites have to serve thousands of concurrent requests and return the correct text, image and video responses to each of them. To serve a large number of requests, the content is usually distributed across multiple servers. A load balancer sits in front of these servers and acts as a traffic cop, directing each request to a suitable server. It makes sure that no server is overloaded, and provides high availability and reliability by ensuring all requests are served. If a server goes down, it starts redirecting requests to the other servers that are still online.
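Here is one simple way that traffic-cop behavior could look in Python. I picked round robin (take turns, skip servers that are down) because it is easy to follow; it is only one of several strategies load balancers use, and the class name and server names are invented for the example.

```python
class RoundRobinBalancer:
    """Hands out servers in turn, skipping any marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once per call.
        for _ in range(len(self.servers)):
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["web-01", "web-02", "web-03"])
balancer.mark_down("web-02")
print([balancer.pick() for _ in range(4)])
```

In a real deployment the balancer would learn which servers are down from periodic health checks rather than from manual `mark_down` calls.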
Web servers are usually protected by a firewall that guards the system against breaches and attacks. For example, if a bad source starts flooding the web server with a large number of concurrent requests, the firewall can detect the problem and block requests from that IP address to keep them from reaching the web server.
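A stripped-down version of that flood detection might look like the Python sketch below: count each IP address's recent requests and block the address once it exceeds a limit. The class name, the limit and the time window are assumptions I chose for the example; real firewalls use far more sophisticated rules.

```python
from collections import defaultdict, deque


class RateLimitingFirewall:
    """Blocks an IP once it sends too many requests within a time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # ip -> timestamps of recent requests
        self.blocked = set()

    def allow(self, ip, now):
        """Return True if the request may pass through to the web server."""
        if ip in self.blocked:
            return False
        recent = self._recent[ip]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        recent.append(now)
        if len(recent) > self.max_requests:
            self.blocked.add(ip)
            return False
        return True


firewall = RateLimitingFirewall(max_requests=3, window_seconds=1.0)
print([firewall.allow("203.0.113.7", t) for t in (0.0, 0.1, 0.2, 0.3)])
```

Passing the timestamp in as `now` keeps the example deterministic; a real deployment would read the clock itself.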
Now, type in www.holbertonschool.com. In seconds, you’re at the site and you can enjoy your visit, thanks to the power of the Internet.