The TCP/IP Protocol Stack

TCP/IP is the world's most widely used non-proprietary protocol suite because it enables computers using diverse hardware and software platforms, on different types of networks, to communicate. The protocols work equally well in both LANs and WANs. TCP/IP is a collection of protocols named after its two best-known and most important protocols, the Transmission Control Protocol (TCP) and the Internet Protocol (IP). As well as these relatively low-level protocols, TCP/IP includes several higher-level protocols that facilitate common applications such as electronic mail, terminal emulation, and file transfer. As we have seen, the Internet protocols used today were originally developed as part of the ARPANET research project, which started in the 1960s and led to the emergence of the global network of networks we call the Internet. Each Internet protocol, together with any subsequent amendments, is described in a document known as a Request for Comments (RFC). A list of RFCs is available at: http://www.ietf.org/rfc.html.

TCP/IP layers

The TCP/IP protocol suite can be modelled as a layered protocol stack, allowing TCP/IP to be compared with other layered models such as the OSI Reference Model. The TCP/IP model has four layers. From lowest to highest, these are the link layer, the internet layer, the transport layer, and the application layer, as shown below.

The TCP/IP layers and protocol stack


The link layer is only loosely defined, because it may consist of virtually any low-level network technology, including Ethernet, X.25, Point-to-Point Protocol (PPP), or whatever happens to be implemented on a particular network or subnet link. The link layer roughly equates to the data-link and physical layers of the OSI 7-layer reference model, and provides the interface with the underlying network hardware.

The internet layer provides the addressing and routing functions that are used to deliver messages to their destination. Internet Protocol (IP) is the most important protocol in this layer. It is a connectionless, unreliable protocol that does not provide flow control or error handling, and attempts to deliver datagrams (in the form of IP packets) on a best-effort basis. Network devices called routers forward incoming datagrams according to the destination IP address specified within the IP packet. The internet layer corresponds more or less to the network layer of the OSI model. Other protocols at this layer include Internet Control Message Protocol (ICMP) and Internet Group Management Protocol (IGMP).

The transport layer oversees the end-to-end transfer of data, and can handle a number of data streams simultaneously. The main transport layer protocol is Transmission Control Protocol (TCP), which provides a reliable, connection-oriented service. User Datagram Protocol (UDP) provides an unreliable, connectionless service (delivery is not guaranteed, but UDP is useful for applications for which speed is more important than reliability). The transport layer roughly corresponds to its namesake in the OSI model.
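
The connectionless nature of UDP can be illustrated with a minimal Python sketch. It assumes the loopback address 127.0.0.1 is available and lets the operating system choose a free port; the function name udp_roundtrip is purely illustrative. No connection is set up before the datagram is sent, which is the key difference from TCP.

```python
import socket

def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback UDP and return what the receiver got."""
    # SOCK_DGRAM selects UDP; a TCP socket would use SOCK_STREAM and would
    # have to call connect() before any data could be sent.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # UDP itself never reports a lost datagram
        # No handshake: the datagram is simply addressed and sent.
        sender.sendto(payload, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

On a real network the datagram could be lost without either side being told, which is why the sketch sets a timeout on the receiving socket rather than waiting indefinitely.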

An application layer protocol is specific to a particular type of application (e.g. file transfer, electronic mail, network management etc.) and is sometimes embodied within the application's client software, although it could also be implemented within the operating system software. The interface between an application layer protocol and a transport layer protocol is defined with reference to port numbers and sockets (more about this later). The application layer effectively combines the functionality of the application, presentation and session layers of the OSI model.

Each layer in the TCP/IP model handles a particular set of problems involving some aspect of sending data between distributed user applications, i.e. applications that are running on different computers, and often on different networks. Each of the lower three layers provides services to the layer immediately above it, while the application layer provides an interface between the user application above it and the communication-oriented layers below it. As the raw data moves from the application itself down through the various layers, it is wrapped up (or encapsulated) within protocol data units (PDUs) created by each of the protocols it encounters. The names commonly used to refer to these protocol data units tend to vary. At the internet layer, for example, they are called packets or datagrams. At the link layer, they are more often called frames.

The diagram below illustrates how successive headers are added by protocols working at each layer. Data from an application is passed down to the appropriate application layer protocol, which encapsulates the data within a protocol data unit (PDU) by adding some header information. The entire PDU is then passed down to the transport layer protocol, and undergoes a similar process here. This encapsulation is repeated for the internet layer and the link layer. The frame that is built by the link layer is then sent on the first leg of its journey (to a network switch or router, for example) via some physical transmission medium as a stream of bits.

Encapsulation of data in the TCP/IP protocol stack


The Internet - indeed any large internetwork - consists of a number of autonomous networks connected together by gateways. A gateway is a special kind of computer called a router. Each router has connections to at least two networks, including its own home network. The router's main function is to examine incoming datagrams from its own and other networks, and send them out again along the correct path according to the network number indicated by each datagram's destination IP address.
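
The forwarding decision described above can be sketched in Python using the standard ipaddress module. The routing table, the interface names (eth0, eth1, wan0) and the function name choose_interface are all hypothetical. The sketch also assumes that, where several networks match, the most specific one (the longest prefix) wins, which is how routers commonly resolve overlapping entries.

```python
import ipaddress

# Hypothetical routing table: destination network -> outgoing interface.
ROUTES = {
    ipaddress.ip_network("192.168.1.0/24"): "eth0",
    ipaddress.ip_network("10.0.0.0/8"): "eth1",
    ipaddress.ip_network("0.0.0.0/0"): "wan0",  # default route, matches everything
}

def choose_interface(destination: str) -> str:
    """Return the interface on which a datagram for `destination` is sent out."""
    address = ipaddress.ip_address(destination)
    # Keep every network that contains the destination address ...
    matches = [network for network in ROUTES if address in network]
    # ... and prefer the one with the longest prefix (the most specific match).
    best = max(matches, key=lambda network: network.prefixlen)
    return ROUTES[best]
```

A datagram addressed to 192.168.1.42 would leave via eth0 in this sketch, while one addressed to a network that has no specific entry would fall through to the default route.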

A computer on any of the networks in an internetwork should be able to send data to, or receive data from, any other computer, whether on the same network or on a different network. Datagrams must often pass through many gateways before getting to their final destination. The datagrams may follow various routes from one computer to another, depending on the best routing options available at any given time. Routers constantly gather information about the routes available to other networks, and they use this information to decide where to send an incoming packet on the next leg (or hop) of its journey. A client application wishing to send a request to a server somewhere on the internetwork does not need to know how to get to that network. The client computer only needs to know the destination computer's IP address and the hardware (MAC) address of its own local network gateway router.

Consider a client-server process in which a user types the IP address of a Web server into the address box of a Web browser (we will be looking at URLs and domain names in due course, by the way). All being well, the document that is retrieved and displayed in the browser window will be the default document from the Web root directory on the server in question, usually a file named "index.html". But how does this happen?

Once the server's IP address has been entered into the address box and the user has hit the ENTER key, the browser software asks TCP to establish a connection with port 80 (the HTTP port) on the server (any available port number on the client computer can be used for the client end of the connection). The connection must be established before any data can be sent. The browser's HTTP implementation now constructs an HTTP request message containing the URL (in this example the IP address typed into the browser's address box), and passes the request to TCP.
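
The sequence just described (connect first, then send the HTTP request) can be sketched in Python. This is a minimal illustration, not a description of how any particular browser works. To keep it self-contained, a stand-in listening socket on the loopback address plays the part of the server, and it uses a port chosen by the operating system rather than port 80, which normally requires administrator privileges to bind. The function names build_http_request and demo are hypothetical.

```python
import socket

def build_http_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request for the given host and path."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # the two empty strings produce the blank line
        "",  # that terminates the request headers
    ]
    return "\r\n".join(lines).encode("ascii")

def demo():
    """Connect to a stand-in server, send a request, report ports and bytes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(2.0)
    listener.bind(("127.0.0.1", 0))  # stand-in for port 80 on a real server
    listener.listen(1)
    server_ip, server_port = listener.getsockname()

    # The TCP connection is established before any data is sent.
    client = socket.create_connection((server_ip, server_port), timeout=2.0)
    client_port = client.getsockname()[1]  # port picked by the OS for the client end
    client.sendall(build_http_request(server_ip))

    connection, peer = listener.accept()
    connection.settimeout(2.0)
    received = b""
    while not received.endswith(b"\r\n\r\n"):
        chunk = connection.recv(4096)
        if not chunk:
            break
        received += chunk

    for sock in (connection, client, listener):
        sock.close()
    return client_port, peer[1], received
```

The port the server sees as the source of the connection is the one the operating system assigned to the client, which illustrates the point that any available client port can be used.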

TCP constructs its own protocol data unit (PDU) by attaching a TCP header to the HTTP request message. This PDU is then passed to the Internet Protocol (IP). IP constructs an IP packet (or datagram) by attaching its own header to the TCP protocol data unit. How the datagram is actually transferred across the local network from the client computer to the gateway router will depend on the underlying network technology. If the LAN on which the client computer resides is an Ethernet network, for example, the datagram will traverse the network encapsulated within an Ethernet frame.

As the data moves down through the protocol stack, each protocol encapsulates the data (and any headers already attached to it) by adding its own header. The Transmission Control Protocol adds a TCP header that includes the source and destination port numbers. The Internet Protocol adds the source and destination IP addresses. The link layer protocol adds a source and destination hardware (MAC) address. The resulting packet (or frame) is then transmitted over a physical transmission medium to another node on the local network.
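
The layering of headers can be sketched in Python as follows. The header layouts are deliberately simplified and are not the real wire formats: the TCP-like header carries only the two port numbers, the IP-like header only the two addresses, and the link layer header follows the Ethernet ordering of destination address, source address and a type field. All addresses, port numbers and function names are illustrative assumptions.

```python
import socket
import struct

def mac_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' notation into six raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))

def tcp_encapsulate(data: bytes, src_port: int, dst_port: int) -> bytes:
    # Simplified transport header: source and destination port only.
    return struct.pack("!HH", src_port, dst_port) + data

def ip_encapsulate(segment: bytes, src_ip: str, dst_ip: str) -> bytes:
    # Simplified internet header: source and destination IP address only.
    return socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + segment

def link_encapsulate(packet: bytes, src_mac: str, dst_mac: str) -> bytes:
    # Ethernet-style ordering: destination MAC, source MAC, type (0x0800 = IPv4).
    header = mac_bytes(dst_mac) + mac_bytes(src_mac) + struct.pack("!H", 0x0800)
    return header + packet

request = b"GET / HTTP/1.1\r\n\r\n"
segment = tcp_encapsulate(request, 49152, 80)
packet = ip_encapsulate(segment, "192.168.1.10", "203.0.113.5")
frame = link_encapsulate(packet, "02:00:00:00:00:01", "02:00:00:00:00:fe")
```

Each call wraps the output of the previous one, so the application data ends up at the tail of the frame with the link layer header outermost.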

The initial destination will be the local network's default gateway router (this is normally specified within the TCP/IP network configuration file on the client computer). Because hardware addresses are used by link layer protocols to find computers and other networked devices on the local network, the link layer protocol active on the client computer will need to resolve the gateway router's IP address to a hardware address in order to send the frame (we will see how this is done later in this section). As data travels through a network or internetwork, the source and destination hardware addresses will change on every hop as the data traverses the individual network and subnet links that make up the transmission path from the point of origin to the final destination. The source and destination IP addresses, on the other hand, will remain the same.
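
The behaviour of the two kinds of address can be modelled with a short Python sketch. The Frame class, the forward function, the two-router path and every address in it are hypothetical; only the four address fields are modelled. Each router rewrites the hardware addresses for the next link and leaves the IP addresses untouched.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str

def forward(frame: Frame, out_mac: str, next_hop_mac: str) -> Frame:
    """Re-frame a packet for the next link: new MAC addresses, same IP addresses."""
    return replace(frame, src_mac=out_mac, dst_mac=next_hop_mac)

# Hypothetical path: client -> router 1 -> router 2 -> server.
# Each entry is (outgoing MAC of the router, MAC of the next hop).
hops = [
    ("02:00:00:00:01:02", "02:00:00:00:02:01"),  # router 1 to router 2
    ("02:00:00:00:02:02", "02:00:00:00:00:99"),  # router 2 to the server
]

# The client first addresses the frame to its default gateway (router 1).
frame = Frame("02:00:00:00:00:01", "02:00:00:00:01:01",
              "192.168.1.10", "203.0.113.5")
journey = [frame]
for out_mac, next_hop_mac in hops:
    frame = forward(frame, out_mac, next_hop_mac)
    journey.append(frame)
```

Printing journey shows three different pairs of hardware addresses, one per link, alongside a single unchanged pair of IP addresses.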

The source and destination hardware addresses change as the packet traverses the internetwork, while the source and destination IP addresses remain constant


As the data moves upwards through the protocol stack on the destination computer, each protocol reads the control information contained within its own header before stripping the header from the frame or packet, and passing the remaining data up to the next protocol in the stack. The link layer, for example, removes the link layer framing and passes the frame's payload (an IP packet) to IP. IP in turn removes the IP header and passes the resulting protocol data unit to TCP. Once the application layer protocol has removed its header, only the data remains and is passed to the appropriate application.
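
The receiving side can be sketched in Python as the reverse of encapsulation. As a simplifying assumption, the three header layouts below carry only addresses, a type field and port numbers rather than the full real headers, and the HTTP request is treated as opaque application data. The sample frame, its addresses and the function names are hypothetical.

```python
import socket
import struct

# Simplified header layouts (network byte order, no padding).
LINK_HEADER = struct.Struct("!6s6sH")  # destination MAC, source MAC, type
IP_HEADER = struct.Struct("!4s4s")     # source IP, destination IP
TCP_HEADER = struct.Struct("!HH")      # source port, destination port

def strip_header(layout: struct.Struct, pdu: bytes):
    """Read one header and return (header fields, remaining payload)."""
    fields = layout.unpack_from(pdu)
    return fields, pdu[layout.size:]

def receive(frame: bytes) -> dict:
    """Pass a frame up the stack, removing one header per layer."""
    (_dst_mac, _src_mac, _frame_type), packet = strip_header(LINK_HEADER, frame)
    (src_ip, dst_ip), segment = strip_header(IP_HEADER, packet)
    (_src_port, dst_port), data = strip_header(TCP_HEADER, segment)
    return {
        "src_ip": socket.inet_ntoa(src_ip),
        "dst_ip": socket.inet_ntoa(dst_ip),
        "dst_port": dst_port,  # identifies the receiving application
        "data": data,
    }

# A sample frame as it might arrive at the destination computer.
frame = (
    LINK_HEADER.pack(bytes.fromhex("020000000099"),
                     bytes.fromhex("020000000202"), 0x0800)
    + IP_HEADER.pack(socket.inet_aton("192.168.1.10"),
                     socket.inet_aton("203.0.113.5"))
    + TCP_HEADER.pack(49152, 80)
    + b"GET / HTTP/1.1\r\n\r\n"
)
result = receive(frame)
```

Each layer in the sketch looks only at its own header and hands everything after it to the layer above, so by the time the data reaches the application all of the lower-layer headers have gone.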