How does the Internet work? The videos you watch on YouTube travel thousands of miles from a Google data center to reach you. Let’s learn how the Internet works by following the details of this data’s incredible journey.
A data center, which can be thousands of miles away from you, has this article stored inside it. How does this data reach your mobile phone or laptop? An easy way to achieve this goal would be with the use of satellites. From the data center, a signal could be sent to the satellite via an antenna, and then from the satellite a signal could be sent to your mobile phone via another antenna near you, as illustrated in Fig:1. However, this way of transmitting signals is not a good idea. Let’s see why.
The satellite is parked nearly 22,000 miles above the earth’s equator. So for the data transmission to succeed, the data would have to travel a total distance of about 44,000 miles. Such a long distance causes a significant delay in receiving the signal; more specifically, it causes high latency, which is unacceptable for most Internet applications. Moreover, this type of communication is exposed to weather conditions, demands a clear line of sight, and offers low bandwidth.
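To get a feel for the size of that delay, here is a rough back-of-the-envelope sketch. It assumes the signal travels at the speed of light in a vacuum, that the satellite is directly overhead, and that no time is lost in processing; the function name is made up for illustration, and real delays would be somewhat higher.

```python
# Rough delay for a signal relayed through a satellite parked 22,000 miles up.
# Assumptions (not from the article): vacuum speed of light, satellite directly
# overhead, zero processing time on the ground or in the satellite.
SPEED_OF_LIGHT_MILES_PER_S = 186_282
ALTITUDE_MILES = 22_000


def one_way_delay_seconds(altitude_miles: float) -> float:
    """Time for a signal to travel up to the satellite and back down."""
    total_distance = 2 * altitude_miles  # uplink plus downlink
    return total_distance / SPEED_OF_LIGHT_MILES_PER_S


delay = one_way_delay_seconds(ALTITUDE_MILES)
print(f"one-way delay: {delay * 1000:.0f} ms")
print(f"request plus response: {2 * delay * 1000:.0f} ms")
```

Under these assumptions a single trip takes roughly a quarter of a second, and a request followed by its response takes about twice that, before any other delay is counted.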
So if this webpage does not reach you via a satellite, how does it actually get to you? Well, it is done with the help of a complicated network of optical fiber cables that connects the data center and your device. Your phone could be connected to the Internet via cellular data or any WiFi router, but ultimately, at some point, your phone will be connected to this network of optical fiber cables (Fig:2A).
Any YouTube video you watch is stored inside a data center; to be more specific, it is stored in a solid-state drive (SSD) within the data center (Fig: 3A). This SSD acts as the internal memory of a server. The server is simply a powerful computer (Fig: 3B) whose job is to provide you with the video, or other stored content, when you request it. Now the challenge is how to transfer the data stored in the data center specifically to your device via the complex network of optical fiber cables. Let’s see how this is done.
Before proceeding further we should first understand an important concept, which is the concept of an IP address. Every device that is connected to the Internet, whether it is a server, a computer, or a mobile phone, is identified uniquely by a string of numbers known as an IP address (Fig:4).
You can consider the IP address similar to your home address, i.e. the address that uniquely identifies your home (Fig:5A). Any letter sent to you reaches you precisely because of your home address. Similarly in the Internet world, an IP address acts as a shipping address through which all information reaches its destination. Your Internet service provider will decide the IP address of your device, and you are able to see what IP address your ISP has given to your mobile phone or laptop (Fig:5B).
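As a small illustration of what an IP address looks like to a program, the sketch below uses Python’s standard `ipaddress` module to check a few strings. The addresses are placeholders from ranges reserved for documentation, not real servers, and the helper name `describe` is made up for this example. Note that the newer IPv6 format also uses letters, not only digits.

```python
import ipaddress


def describe(text: str) -> str:
    """Say whether a string is a valid IP address and which version it is."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return f"{text}: not a valid IP address"
    # max_prefixlen is the address length in bits: 32 for IPv4, 128 for IPv6.
    return f"{text}: IPv{addr.version}, {addr.max_prefixlen} bits"


# Placeholder addresses from documentation-only ranges, plus one invalid string.
for candidate in ["192.0.2.10", "2001:db8::1", "999.1.1.1"]:
    print(describe(candidate))
```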
The server in the data center also has an IP address. The server stores a website, so you can access any website just by knowing the server’s IP address. However, it is difficult for a person to remember so many IP addresses. To solve this problem, domain names like youtube.com and facebook.com are used; they correspond to IP addresses and are easier for us to remember than long sequences of numbers. Another thing to notice here is that a server is capable of storing several websites (Fig: 6), and if the server hosts multiple websites, its IP address alone cannot tell which website is being requested. In such cases an additional piece of information, the host header, is used to uniquely identify the website. However, for giant websites like facebook.com or youtube.com, the entire data center infrastructure will be dedicated to the storage of that particular website.
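The role of the host header can be sketched without any network at all. In the toy example below, two made-up websites share one placeholder IP address, and the server side picks the right one by reading the `Host` line of an HTTP/1.1 request. The site names, page contents, and function names are all hypothetical; a real web server does far more than this.

```python
# Hypothetical sites hosted on one server; the names and the IP are placeholders.
SERVER_IP = "192.0.2.10"
SITES = {
    "blog.example": "<h1>Blog home page</h1>",
    "shop.example": "<h1>Shop home page</h1>",
}


def build_request(host: str, path: str = "/") -> str:
    """Build a minimal HTTP/1.1 request; the Host header names the website."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"


def pick_site(raw_request: str) -> str:
    """Server side: read the Host header and return the matching site's page."""
    header_lines = raw_request.split("\r\n")[1:]  # skip the request line
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return SITES.get(value.strip().lower(), "404: unknown site")
    return "400: missing Host header"


# Both requests go to the same IP address; only the Host header differs.
for host in SITES:
    print(f"{SERVER_IP} {host} -> {pick_site(build_request(host))}")
```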
To access the Internet, we always use domain names instead of the complex IP address numbers. So where does the Internet get the IP addresses corresponding to our domain name requests? Well, for this purpose the Internet uses a huge phone book known as DNS (Domain Name System). If you know a person’s name but don’t know their telephone number, you can simply look it up in a phone book (Fig:7A). The DNS server provides the same service to the Internet. The DNS server can be managed by your Internet service provider or by other organizations (Fig:7B).
Let’s have a recap of the whole operation. You enter the domain name; the browser sends a request to the DNS server to get the corresponding IP address; after getting the IP address your browser simply forwards the request to the data center, more specifically to the respective server as shown in Fig:8. To save time in the future, your browser also saves this IP address in its cache memory so it will be there next time you want to access that particular site.
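The lookup-then-cache step in this recap can be sketched as follows. This is a toy model, not how any particular browser is implemented: the `Browser` class, the record table, and the domain names are invented for illustration, and a plain dictionary stands in for the remote DNS server.

```python
# A toy "phone book": hypothetical names mapped to documentation-range IPs.
DNS_RECORDS = {
    "video.example": "192.0.2.10",
    "social.example": "198.51.100.7",
}


class Browser:
    """Toy browser that asks DNS once per name and caches the answer."""

    def __init__(self, dns_records: dict):
        self.dns_records = dns_records  # stands in for a remote DNS server
        self.cache = {}                 # domain name -> IP address
        self.dns_queries = 0            # how many times the DNS server was asked

    def resolve(self, domain: str) -> str:
        if domain in self.cache:        # already known: skip the DNS server
            return self.cache[domain]
        self.dns_queries += 1
        ip = self.dns_records[domain]   # raises KeyError for unknown names
        self.cache[domain] = ip
        return ip


browser = Browser(DNS_RECORDS)
first = browser.resolve("video.example")   # goes to the "DNS server"
second = browser.resolve("video.example")  # answered from the cache
print(first, second, "DNS queries:", browser.dns_queries)
```

For a real lookup, Python’s standard `socket.getaddrinfo` asks the operating system’s resolver; it is left out here so the sketch runs without a network connection.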
Once the server gets a request to access a particular website, the data flow starts. The data is transferred in digital format via optical fiber cables, more specifically in the form of light pulses (Fig: 9A). These light pulses sometimes have to travel thousands of miles via the optical fiber cable to reach their destination. During their journey they often have to cross tough terrain, such as hilly areas or the seabed. There are a few global companies (Orange, AT&T, Verizon, Google, etc.) that lay and maintain these optical cable networks. Under the sea, the optical fiber cables are laid with the help of a plow. The plow is lowered to the seabed from a ship (Fig: 9B), and it creates a trench on the seabed into which the optical fiber cable is placed (Fig:9C). In fact, this complex optical cable network is the backbone of the Internet.
These optical fiber cables, carrying the light, are stretched across the seabed to your doorstep, where they are connected to a router. The router converts these light signals to electrical signals. An Ethernet cable is then used to transmit the electrical signals to your laptop as shown in Fig: 10A. However, if you are accessing the Internet using cellular data, from the optical cable the signal has to be sent to a cell tower, and from the cell tower the signal reaches your cellphone in the form of electromagnetic waves as shown in Fig: 10B.
ICANN stands for Internet Corporation for Assigned Names and Numbers. Since the Internet is a global network, it has become important to have an organisation to manage things like IP address assignment, domain name registration, etc. This is all managed by an institution called ICANN, located in the USA.
One amazing thing about the Internet is its efficiency in transmitting data when compared with cellular and landline communication technologies. This video you are watching from the Google data center is sent to you in the form of a huge collection of 0s and 1s. What makes data transfer over the Internet efficient is the way in which these 0s and 1s are chopped up into small chunks known as packets and transmitted (Fig:12).
Let’s assume these streams of 0s and 1s are divided into different packets by the server, where each packet consists of 6 bits (Fig:13A). Along with the bits of the video, each packet also carries a sequence number and the IP addresses of the server and your phone. With this information, the packets are routed towards your phone. It’s not necessary that all packets are routed through the same path; each packet independently takes the best route available at that time, as shown in Fig:13B. Upon reaching your phone, the packets are reassembled according to their sequence numbers. If any packets fail to reach your phone, your phone does not acknowledge them, and the server resends the lost packets.
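The packet idea described above can be sketched in a few lines. This is a simplified model under stated assumptions, not a real network protocol: packets are plain dictionaries, the 24 bits of data are made up, the IP addresses are documentation placeholders, and "resending" is just copying the missing packets again. Real protocols work with bytes rather than 6-bit chunks.

```python
import random


def to_packets(bits: str, src: str, dst: str, size: int = 6) -> list:
    """Chop a bit string into packets carrying a sequence number and addresses."""
    return [
        {"seq": n, "src": src, "dst": dst, "payload": bits[i:i + size]}
        for n, i in enumerate(range(0, len(bits), size))
    ]


def missing_sequence_numbers(received: list, total: int) -> list:
    """Sequence numbers that never arrived and must be sent again."""
    seen = {p["seq"] for p in received}
    return [n for n in range(total) if n not in seen]


def reassemble(received: list) -> str:
    """Put packets back in order by sequence number and join the payloads."""
    ordered = sorted(received, key=lambda p: p["seq"])
    return "".join(p["payload"] for p in ordered)


bits = "110100101110001011010011"  # 24 bits of made-up "video" data
packets = to_packets(bits, src="192.0.2.10", dst="198.51.100.7")

rng = random.Random(0)             # fixed seed so the run is repeatable
arrived = packets[:]
rng.shuffle(arrived)               # packets may arrive out of order
lost = arrived.pop()               # pretend one packet was lost on the way

missing = missing_sequence_numbers(arrived, total=len(packets))
arrived += [p for p in packets if p["seq"] in missing]  # the server resends them

print("resent sequence numbers:", missing)
print("reassembled correctly:", reassemble(arrived) == bits)
```

Because every packet carries its own sequence number, the order of arrival does not matter, and the receiver can tell exactly which pieces are missing.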
Now compare this with a postal network that has good infrastructure but whose customers do not follow the basic rules for writing destination addresses. In this scenario, letters won’t be able to reach the correct destination. Similarly, on the Internet we use something called protocols to manage this complex flow of data packets (Fig:14). The protocols set the rules for converting data into packets, for attaching the source and destination addresses to each packet, and for how routers handle the packets. Different applications use different protocols.