This document is designed to be a brief overview of several common network communication protocols, suitable for a one- or two-day overview of a set of topics that we also teach entire courses about. As such it glosses over or completely ignores many important topics.
In general, any piece of communication to be handled by intermediaries needs two pieces of information: the contents of the communication and the routing information needed to reach the intended destination.
For paper communication, these would be the letter and the envelope, each made of paper and each with information written on them but one (the letter) containing the contents and the other (the envelope) containing the routing information. We know which one is which by how they are arranged: the envelope is outside the letter.
If we wanted to send an entire book we could invest in a different sending protocol such as a box instead of an envelope, but we could also take a few pages at a time and put each in an envelope. As long as the pages are numbered, the recipient could reassemble the book no matter what order the envelopes arrived in. If we had a book with unnumbered pages, we could add the ordering to the envelope instead to still permit reassembling the book.
Common network communications use many of the same ideas as paper post.
Many protocols split the message into packets of some maximum (or in a few cases fixed) size, sending each separately along with an order number so the recipient can reassemble them.
The name for the chopped-up data depends on the layer.
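The split-and-reassemble idea can be sketched in a few lines. This is only an illustration: the function names, the 8-byte maximum size, and the sample text are assumptions chosen for the example, not part of any real protocol.

```python
import random

def split_into_packets(message: bytes, max_size: int) -> list[tuple[int, bytes]]:
    """Chop a message into (order number, chunk) pairs of at most max_size bytes."""
    return [
        (number, message[start:start + max_size])
        for number, start in enumerate(range(0, len(message), max_size))
    ]

def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Rebuild the message by sorting on the order number, whatever the arrival order."""
    return b"".join(chunk for _, chunk in sorted(packets))

original = b"It was the best of times, it was the worst of times"
packets = split_into_packets(original, 8)
random.shuffle(packets)  # simulate envelopes arriving in an arbitrary order
restored = reassemble(packets)
```

Because every chunk carries its own order number, `restored` equals `original` no matter how the shuffle turns out, just as numbered pages let a recipient rebuild a book.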
Paper post is typically transmitted part of the way by a letter inside an envelope inside a crate inside a truck; and another part by a letter inside an envelope inside a mail pouch; etc. Each of these outer containers has routing information, like the envelope does: the carrier route of the pouch, the post office destination of the crate, etc.
Digital communication also puts containers inside containers, and the containers used are called layers.
There are different numbers of layers possible, but the OSI model is often seen as the default set even though the Internet Protocol Suite is better known. Note that OSI numbers its layers, where containers have smaller numbers than their contents.
Consider using a SOCK_STREAM socket, as you do in MP4. When you write a message, the result is that message wrapped in one envelope per layer, the outermost being the one for the WLAN link. That last layer will be stripped off after traversing the WLAN, to be replaced by a different outer envelope to guide it over the next link on its way to its destination.
Note that because of the header/footer design, every layer can slice the data provided by the other layers into smaller pieces if it wishes. Thus, a message could be split into 4 TCP segments; those each split into 2 IP packets to make 8 total packets. Multiple messages can also be placed into a single envelope, though that is less common in practice.
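The nesting of envelopes can be sketched as follows. The bracketed text headers, the `wrap` and `unwrap` helpers, and the routing strings are all made up for illustration; real headers are compact binary structures defined by each protocol.

```python
def wrap(layer: str, routing: str, payload: bytes) -> bytes:
    """Put a payload inside one more envelope: a header, then the contents."""
    header = f"[{layer} {routing}]".encode()
    return header + payload

def unwrap(envelope: bytes) -> tuple[str, bytes]:
    """Strip the outermost envelope, returning its header text and its contents."""
    end = envelope.index(b"]")
    return envelope[1:end].decode(), envelope[end + 1:]

message = b"hello"
segment = wrap("TCP", "port 80", message)          # reach a specific program
packet = wrap("IP", "to 130.126.151.14", segment)  # reach a specific computer
frame = wrap("WLAN", "to next hop", packet)        # travel over one link

# After the wireless link, only the outermost envelope is replaced.
link_header, inner = unwrap(frame)
next_frame = wrap("Ethernet", "to next hop", inner)
```

Note that `inner` is byte-for-byte the same IP packet that was wrapped earlier: the intermediate hop never needs to look inside it.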
While there are many networking protocols, three are pervasive enough to be worth ensuring every CS major has at least some understanding of them: TCP, IP, and UDP. These sit in the middle of the network layer stack: they transmit higher-level protocols like HTTP, SMTP, SSH, etc.; and they are transmitted by lower-level protocols like Wi-Fi, Ethernet, etc.
Layer | Protocols | Use | Notes |
---|---|---|---|
Application | HTTP, SSH, SMTP, etc. | programmer-visible messages | |
Transport | TCP, UDP | reach a specific program | there are other transport layer protocols too |
Internet | IP, which has two versions: IPv4 and IPv6 | reach a specific computer | there are other internet layer protocols too |
Link | MAC, Wi-Fi, etc. | travel over wires and the air | |
IP, or Internet Protocol, has two versions in use (IPv4 and IPv6). There are some important technical updates in version 6, and several other nuanced ideas this write-up ignores.
IP (both versions) works on the following principles:
Every computer has a unique numeric IP Address.
IPv4 typically writes its 32-bit addresses as 4 8-bit decimal values with periods in between them, as in 255.127.63.31 for 0xFF7F3F1F.
IPv6 typically writes its 128-bit addresses as 8 16-bit hexadecimal values separated with colons in square brackets, with a run of all-zero values omitted, as in [c0a:2::2020] for 0x0c0a0002000000000000000000002020.
The fact that there are more than 2^32 (about 4.3 billion) computers is part of the motivation to switch from IPv4 to IPv6.
IP headers contain the IP address of both the originating and end-destination computer.
Messages are delivered via best-effort delivery. Each computer involved in delivering a packet sends it to the computer it has a link to that it believes is most likely to get the message where it is going, but there is no guarantee that it will arrive at all or that the sender will know if it arrived.
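The two address notations above can be checked with Python's standard `ipaddress` module, which converts between the raw number and the conventional written form. The variable names are only for this example.

```python
import ipaddress

# Build addresses from the raw 32-bit and 128-bit numbers used in the text.
v4 = ipaddress.IPv4Address(0xFF7F3F1F)
v6 = ipaddress.IPv6Address(0x0c0a0002000000000000000000002020)

v4_text = str(v4)  # dotted-decimal form
v6_text = str(v6)  # colon-separated hex with the zero run omitted (no brackets)

# The conversion also works in the other direction.
v4_number = int(ipaddress.IPv4Address("255.127.63.31"))

print(v4_text, v6_text, hex(v4_number))
```

The square brackets around an IPv6 address are not part of the address itself; they are commonly added when the address appears somewhere colons would otherwise be ambiguous, such as in a URL.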
UDP, or User Datagram Protocol, is a light-weight transport protocol. It adds 16-bit source and destination port numbers that the OS of the involved computers can use to ensure it gets to the correct program; and a checksum to detect errors that might arise during transmission.
UDP adds no other features on top of the underlying IP: messages are not guaranteed to arrive at all, nor in any particular order; cannot be dynamically split into different sizes; etc. As such, it adds little if any delay on top of IP and is used for speed-sensitive messages like clock synchronization and streaming media. Software that uses UDP generally has to be designed on the assumption that some packets that are sent will never arrive.
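A minimal sketch of sending one UDP datagram between two sockets on the same computer, using Python's standard `socket` module. The payload `b"tick"` and the one-second timeout are arbitrary choices for the example; the point is that the receiver is identified by an (IP address, port) pair and that the receiving code plans for the datagram never arriving.

```python
import socket

# Receiver: bind to a port on this computer; port 0 asks the OS to pick a free one.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(1.0)  # UDP may lose packets, so never wait forever
port = receiver.getsockname()[1]

# Sender: no connection setup, just address one datagram to (IP address, port).
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"tick", ("127.0.0.1", port))

try:
    data, (source_ip, source_port) = receiver.recvfrom(1024)
except socket.timeout:
    data = None  # the datagram never arrived; the program must cope
finally:
    sender.close()
    receiver.close()
```

On a single machine the datagram will almost always arrive, but nothing in UDP promises that, which is why the `except` branch is part of the design rather than an afterthought.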
TCP, or Transmission Control Protocol, is a reliable, ordered, connection-oriented transport protocol. It creates this over IP by a somewhat complicated state machine; a core idea in this is that each received message must be acknowledged by the recipient and will be re-sent by the sender otherwise.
Suppose two computers, a sender and a receiver, have established a TCP connection with one another. The sender might send the message "This is a triumph" as follows, using (contents, sequence number) to represent a TCP segment:
| Sender | IP | Receiver |
|---|---|---|
| sends (This, 1) | | |
| | → | |
| | | receives (This, 1) |
| | | sends (ACK, 2) |
| | ← | |
| receives (ACK, 2) | | |
| sends (" is a ", 2) | | |
| | (dropped) | |
| sends (triumph, 3) | | |
| | → | |
| | | receives (triumph, 3) |
| | | expected message number 2, not received |
| | | resends (ACK, 2) |
| | ← | |
| receives (ACK, 2), which means segment 2 never arrived… | | |
| resends (" is a ", 2) | | |
| | → | |
| resends (triumph, 3) as it came after segment 2 | | |
| | → | |
| | | receives (" is a ", 2) |
| | | receives (triumph, 3) |
| | | sends (ACK, 4) |
| | ← | |
| receives (ACK, 4) | | |
| sends (FIN) to begin closing connection | | |
In general, either side may re-send a message if they have not received the expected response, and neither needs to wait for the other’s message to send the next. This way if messages seem to be arriving well, more messages may be sent in groups; if acknowledgments are not arriving, transmission may be slowed down to wait for ACKs before the next sending.
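The acknowledge-and-resend idea can be sketched as a small simulation. This is a deliberately simplified model, not real TCP: the `deliver` function, its `drop_once` parameter, and the log strings are assumptions for the example. The sender transmits every outstanding segment without waiting, the receiver replies with the number of the first segment it still lacks, and the sender resumes from that number.

```python
def deliver(chunks, drop_once):
    """Simulate reliable delivery over a channel that drops some segments.

    chunks: list of contents; segment numbers start at 1.
    drop_once: segment numbers the channel drops the first time they are sent.
    Returns (contents in order as held by the receiver, log of events).
    """
    already_dropped = set()
    received = {}      # segment number -> contents held by the receiver
    expected = 1       # next segment number the receiver is waiting for
    next_to_send = 1   # where the sender (re)starts transmitting
    log = []
    while expected <= len(chunks):
        # Sender transmits everything from next_to_send onward without waiting.
        for number in range(next_to_send, len(chunks) + 1):
            if number in drop_once and number not in already_dropped:
                already_dropped.add(number)
                log.append(f"segment {number} dropped")
                continue
            received[number] = chunks[number - 1]
            log.append(f"segment {number} received")
        # Receiver acknowledges with the first segment number it still lacks.
        while expected in received:
            expected += 1
        log.append(f"ACK {expected}")
        # Sender resumes from the acknowledged number, resending if needed.
        next_to_send = expected
    return [received[n] for n in range(1, len(chunks) + 1)], log

contents, log = deliver(["This", " is a ", "triumph"], drop_once={2})
```

With segment 2 dropped once, the log shows an `ACK 2` after the first round, a resend of segments 2 and 3, and a final `ACK 4`, mirroring the resend-from-the-gap behavior in the exchange above.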
TCP is significantly slower than UDP, requiring overhead to establish and shut down a connection, but provides reliable delivery over unreliable IP. The TCP over IP combination is so prevalent that "TCP/IP" is sometimes used as a term for the entire Internet protocol suite.
We are accustomed to navigating the web with Uniform Resource Locators (URLs). URLs are a special type of URI, and have many components, but three will suffice for this introduction.
A scheme like HTTP or HTTPS; this identifies the protocol used at the application layer.
A hostname, which (conceptually) identifies a particular computer.
A path, which is given to the computer to identify a specific resource.
Note that the hostname here performs a similar function to an IP address, but for a different audience. IP addresses are organized to help computers locate one another. Hostnames are organized to help humans locate computers.
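The three URL components named above can be pulled apart with `urlsplit` from Python's standard library. The path `/some/path` is a made-up example, not a real page.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://courses.grainger.illinois.edu/some/path")

scheme = parts.scheme      # which application-layer protocol to speak
hostname = parts.hostname  # which computer to contact (still needs an IP address)
path = parts.path          # which resource to ask that computer for

print(scheme, hostname, path)
```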
When you visit a URL, your browser first needs to convert the hostname in the URL to the IP address of a computer so that it can use TCP/IP to route your request to the appropriate server. It does this by consulting a special mapping data structure, which is maintained online and can be updated by the party owning the hostname. That means it first queries the Internet for the IP address of the hostname before querying it again with the website request itself.
Having such a distributed mapping between what users type and what computers do allows many conveniences, such as the ability to change servers without the need to have everyone learn the new server’s IP addresses. However, it also comes at a price: someone has to decide who gets to change the IP address of courses.grainger.illinois.edu, which means someone has to know who owns each address and prevent others from taking it. This work requires resources and human judgment calls, meaning it requires money, meaning it costs money to register a hostname.
In the earliest days of the ARPANET, the mapping between hostnames and IP addresses was a single file, HOSTS.TXT. Each line contained a hostname and an IP address (more than this, really: it also had several line types, NET, HOST, and GATEWAY, and some information about the type of computer it described). Initially this was stored on one computer and users learned its contents by calling up an operator via telephone and asking the operator for the IP address of a given host. Later these telephone calls were moved to a digital protocol, WHOIS, but it remained initially on one computer owned by one company, SRI.
As this directory grew in importance, SRI needed to decide who could have which hostnames. Elizabeth Feinler and her team managed WHOIS and created the idea of domain names to help organize requests. A domain name is a hierarchical sequence of strings separated with periods, with the last string (called the top-level domain) being the most important. SRI decided that top-level domains would be allocated based on type of organization, with .com for commercial entities, .edu for educational, .gov for government, etc.
In the hostname courses.grainger.illinois.edu,
edu is the top-level domain
illinois.edu is a subdomain of edu
grainger.illinois.edu is a subdomain of illinois.edu
courses.grainger.illinois.edu is a subdomain of grainger.illinois.edu
edu, illinois, grainger, and courses are called labels
As the Internet grew and a single file on a single server became unwieldy, various proposals were created for how to distribute the load. After many iterations, this has grown into the current Domain Name System (DNS).
DNS is a fairly complicated set of concepts, but the iconic and most important core is the address resolution mechanism.
DNS address resolution uses two basic ideas:
Requests start at a top-level DNS server, which replies with the DNS server of the top-level domain; that DNS server is then asked about the next part of the domain, and so on until the full domain is handled.
Replies are cached (stored locally) so that each query may be sent just once.
Generally when I ask my browser to visit courses.grainger.illinois.edu, it just contacts 130.126.151.14 without any DNS queries because my browser has it in its DNS cache.
But if there was no cache, the request would do something like the following:
Request | To | Reply |
---|---|---|
. | my ISP’s DNS server | a list of 13 known top-level-domain DNS servers; we randomly select 192.33.4.12. |
edu. | 192.33.4.12 | a list of 13 .edu-domain DNS servers; we randomly select 192.41.162.30. |
illinois.edu. | 192.41.162.30 | a list of 3 .illinois.edu-domain DNS servers; we randomly select 192.41.162.30. |
grainger.illinois.edu. | 192.41.162.30 | a list of 1 grainger.illinois.edu-domain DNS server; we randomly select 3.16.92.183. |
courses.grainger.illinois.edu. | 3.16.92.183 | The IP address 130.126.151.14. |
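The lookup walk and the cache can be sketched with an in-memory stand-in for the servers. Nothing here touches the network: the `SERVERS` dictionary, the `"isp"` key, and the `resolve` and `domains_to_ask` functions are hypothetical, and each reply is reduced to the single server that was selected in the walk above. A real resolver also caches the intermediate replies, which this sketch skips.

```python
# Hypothetical stand-ins for DNS servers, keyed by the addresses used above.
# Each maps a queried name to a referral (another server) or a final address.
SERVERS = {
    "isp": {".": ("referral", "192.33.4.12")},
    "192.33.4.12": {"edu.": ("referral", "192.41.162.30")},
    "192.41.162.30": {
        "illinois.edu.": ("referral", "192.41.162.30"),
        "grainger.illinois.edu.": ("referral", "3.16.92.183"),
    },
    "3.16.92.183": {"courses.grainger.illinois.edu.": ("address", "130.126.151.14")},
}

cache = {}        # hostname -> IP address already learned
queries_sent = 0  # how many questions were actually asked

def domains_to_ask(hostname):
    """'a.b.c' -> ['.', 'c.', 'b.c.', 'a.b.c.'], from the root downward."""
    labels = hostname.split(".")
    return ["."] + [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]

def resolve(hostname):
    """Return the IP address for hostname, asking servers only on a cache miss."""
    global queries_sent
    if hostname in cache:
        return cache[hostname]
    server = "isp"
    for name in domains_to_ask(hostname):
        queries_sent += 1
        kind, value = SERVERS[server][name]
        if kind == "address":
            cache[hostname] = value
            return value
        server = value  # follow the referral for the next, longer name
    raise LookupError(hostname)

first = resolve("courses.grainger.illinois.edu")
second = resolve("courses.grainger.illinois.edu")  # answered from the cache
```

The first call asks five questions, one per row of the table; the second call asks none because the answer is already cached.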
There are many additional components to the DNS protocol, including digital signatures to validate that they are not spoofed, messages for registering new DNS entries and changing old ones, etc.
As DNS has grown in complexity, DNS servers have grown into ever more-complicated database engines. However, the core DNS lookup process has not changed since the first DNS specifications (RFC 882 and RFC 883) were released in 1983.
Some IP addresses are specified statically, procured from the set of all possible IP addresses by a single computer and used only by that computer. Others are assigned dynamically as needed, picking an IP address when a computer is placed on the Internet and removing it when it disconnects. The dynamic process is governed by the Dynamic Host Configuration Protocol, or DHCP.
DHCP differs in its details for IPv4 and IPv6 and has several nuances, but the most common model is as follows:
The computer sends a discovery message from address 0.0.0.0 ("no one") to address 255.255.255.255 ("everyone"). Most computers seeing this message ignore it.
The server owning the connection sends an offer of an available IP address; everyone but the target computer should ignore this message.
The computer sends a request: "thanks, I’ll take that IP address".
The server sends an acknowledgment: "great, it’s yours".
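The four-step exchange can be sketched as a toy state machine for the server side. Everything here is illustrative: the `DhcpServer` class, its method names, the `"laptop"` client identifier, and the 192.168.0.x pool are made up, and real DHCP messages are binary packets with many more fields.

```python
class DhcpServer:
    """Toy stand-in for the server owning the connection; the pool is made up."""

    def __init__(self, pool):
        self.free = list(pool)  # addresses nobody is using
        self.offered = {}       # client id -> address offered, not yet confirmed
        self.leased = {}        # client id -> address currently in use

    def on_discover(self, client_id):
        # Offer the first free address to the computer that asked.
        address = self.free.pop(0)
        self.offered[client_id] = address
        return ("offer", client_id, address)

    def on_request(self, client_id, address):
        # Confirm only an address this server actually offered to this client.
        if self.offered.get(client_id) != address:
            return ("nak", client_id, address)
        del self.offered[client_id]
        self.leased[client_id] = address
        return ("ack", client_id, address)

    def on_release(self, client_id):
        # When the computer disconnects, its address returns to the pool.
        self.free.append(self.leased.pop(client_id))

server = DhcpServer(["192.168.0.10", "192.168.0.11"])
_, _, offered = server.on_discover("laptop")          # computer asks everyone
kind, _, mine = server.on_request("laptop", offered)  # "thanks, I'll take that"
```

Calling `server.on_release("laptop")` afterward puts the address back in the pool, which is the dynamic part of the scheme: addresses are held only while a computer is connected.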