What is TCP/IP, An Introduction

Introduction to TCP/IP

Summary: TCP and IP were developed by a Department of Defense
(DOD) research project to connect a number of different networks
designed by different vendors into a network of networks (the
"Internet"). It was initially successful because it
delivered a few basic services that everyone needs (file
transfer, electronic mail, remote logon) across a very large
number of client and server systems. Several computers in a small
department can use TCP/IP (along with other protocols) on a
single LAN. The IP component provides routing from the department
to the enterprise network, then to regional networks, and finally
to the global Internet. On the battlefield a communications
network will sustain damage, so the DOD designed TCP/IP to be
robust and automatically recover from any node or phone line
failure. This design allows the construction of very large
networks with less central management. However, because of the
automatic recovery, network problems can go undiagnosed and
uncorrected for long periods of time.

As with all other communications protocols, TCP/IP is composed
of layers:

  • IP - is responsible for moving packets of data from
    node to node. IP forwards each packet based on a four
    byte destination address (the IP number). The Internet
    authorities assign ranges of numbers to different
    organizations. The organizations assign groups of their
    numbers to departments. IP operates on gateway machines
    that move data from department to organization to region
    and then around the world.
  • TCP - is responsible for verifying the correct
    delivery of data from client to server. Data can be lost
    in the intermediate network. TCP adds support to detect
    errors or lost data and to trigger retransmission until
    the data is correctly and completely received.
  • Sockets - is a name given to the package of
    subroutines that provide access to TCP/IP on most
    systems.

Network of Lowest Bidders

The Army puts out a bid on a computer and DEC wins the bid.
The Air Force puts out a bid and IBM wins. The Navy bid is won by
Unisys. Then the President decides to invade Grenada and the
armed forces discover that their computers cannot talk to each
other. The DOD must build a "network" out of systems
each of which, by law, was delivered by the lowest bidder on a
single contract.

The Internet Protocol was developed to create a Network of
Networks (the "Internet"). Individual machines are
first connected to a LAN (Ethernet or Token Ring). TCP/IP shares
the LAN with other uses (a Novell file server, Windows for
Workgroups peer systems). One device provides the TCP/IP
connection between the LAN and the rest of the world.

To ensure that all types of systems from all vendors can
communicate, TCP/IP is absolutely standardized on the LAN.
However, larger networks based on long distances and phone lines
are more volatile. In the US, many large corporations would wish
to reuse large internal networks based on IBM's SNA. In Europe,
the national phone companies traditionally standardize on X.25.
However, the sudden explosion of high speed microprocessors,
fiber optics, and digital phone systems has created a burst of
new options: ISDN, frame relay, FDDI, Asynchronous Transfer Mode
(ATM). New technologies arise and become obsolete within a few
years. With cable TV and phone companies competing to build the
National Information Superhighway, no single standard can govern
citywide, nationwide, or worldwide communications.

The original design of TCP/IP as a Network of Networks fits
nicely within the current technological uncertainty. TCP/IP data
can be sent across a LAN, or it can be carried within an internal
corporate SNA network, or it can piggyback on the cable TV
service. Furthermore, machines connected to any of these networks
can communicate to any other network through gateways supplied by
the network vendor.

Addresses

Each technology has its own convention for transmitting
messages between two machines within the same network. On a LAN,
messages are sent between machines by supplying the six byte
unique identifier (the "MAC" address). In an SNA
network, every machine has Logical Units with their own network
address. DECNET, Appletalk, and Novell IPX all have a scheme for
assigning numbers to each local network and to each workstation
attached to the network.

On top of these local or vendor specific network addresses,
TCP/IP assigns a unique number to every workstation in the world.
This "IP number" is a four byte value that, by
convention, is expressed by converting each byte into a decimal
number (0 to 255) and separating the bytes with a period. For
example, the PC Lube and Tune server is 130.132.59.234.
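
The dotted decimal convention can be sketched in a few lines of Python. This is only an illustration of the notation described above; the function names to_dotted and from_dotted are made up for this example and are not part of any TCP/IP package.

```python
def to_dotted(value: int) -> str:
    """Render a four byte IP number as decimal bytes separated by periods."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("an IP number must fit in four bytes")
    # to_bytes(4, "big") yields the four bytes, most significant first.
    return ".".join(str(b) for b in value.to_bytes(4, "big"))


def from_dotted(text: str) -> int:
    """Convert dotted decimal text back into the four byte IP number."""
    parts = [int(p) for p in text.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("expected four decimal bytes, each 0 to 255")
    return int.from_bytes(bytes(parts), "big")


if __name__ == "__main__":
    number = from_dotted("130.132.59.234")
    print(hex(number), to_dotted(number))
```

The two functions are inverses of each other, so converting the PC Lube and Tune address to a number and back returns the same text.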

An organization begins by sending electronic mail to
Hostmaster@INTERNIC.NET requesting assignment of a network
number. It is still possible for almost anyone to get assignment
of a number for a small "Class C" network in which the
first three bytes identify the network and the last byte
identifies the individual computer. The author followed this
procedure and was assigned the numbers 192.35.91.* for a network
of computers at his house. Larger organizations can get a
"Class B" network where the first two bytes identify
the network and the last two bytes identify each of up to 64
thousand individual workstations. Yale's Class B network is
130.132, so all computers with IP address 130.132.*.* are
connected through Yale.
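
As a rough sketch, the split between network bytes and workstation bytes might be computed as follows. The function name split_classful is hypothetical. The ranges of the first byte used to tell the classes apart (128 to 191 for Class B, 192 to 223 for Class C) come from the historical class conventions rather than from the text above, and the "Class A" case is included only for completeness.

```python
def split_classful(address: str) -> tuple[str, str]:
    """Split a dotted decimal address into (network part, workstation part)."""
    octets = address.split(".")
    if len(octets) != 4:
        raise ValueError("expected four decimal bytes")
    first = int(octets[0])
    if first < 128:
        network_bytes = 1      # Class A (not discussed in the text)
    elif first < 192:
        network_bytes = 2      # Class B: two network bytes, two host bytes
    elif first < 224:
        network_bytes = 3      # Class C: three network bytes, one host byte
    else:
        raise ValueError("not a Class A, B, or C address")
    return ".".join(octets[:network_bytes]), ".".join(octets[network_bytes:])


if __name__ == "__main__":
    print(split_classful("130.132.59.234"))   # a Yale address (Class B)
    print(split_classful("192.35.91.7"))      # an address in the author's Class C
    print(2 ** 16)                            # values available in two host bytes
```

The last line shows where the "64 thousand" figure comes from: two bytes can hold 65536 distinct values.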

The organization then connects to the Internet through one of
a dozen regional or specialized network suppliers. The network
vendor is given the subscriber network number and adds it to the
routing configuration in its own machines and those of the other
major network suppliers.

There is no mathematical formula that translates the numbers
192.35.91 or 130.132 into "Yale University" or
"New Haven, CT." The machines that manage large
regional networks or the central Internet routers managed by the
National Science Foundation can only locate these networks by
looking each network number up in a table. There are potentially
thousands of Class B networks, and millions of Class C networks,
but computer memory costs are low, so the tables are reasonable.
Customers that connect to the Internet, even customers as large
as IBM, do not need to maintain any information on other
networks. They send all external data to the regional carrier to
which they subscribe, and the regional carrier maintains the
tables and does the appropriate routing.

New Haven is in a border state, split 50-50 between the
Yankees and the Red Sox. In this spirit, Yale recently switched
its connection from the Middle Atlantic regional network to the
New England carrier. When the switch occurred, tables in the
other regional areas and in the national spine had to be updated,
so that traffic for 130.132 was routed through Boston instead of
New Jersey. The large network carriers handle the paperwork and
can perform such a switch given sufficient notice. During a
conversion period, the university was connected to both networks
so that messages could arrive through either path.

Subnets

Although the individual subscribers do not need to tabulate
network numbers or provide explicit routing, it is convenient for
most Class B networks to be internally managed as a much smaller
and simpler version of the larger network organizations. It is
common to subdivide the two bytes available for internal
assignment into a one byte department number and a one byte
workstation ID.

The enterprise network is built using commercially available
TCP/IP router boxes. Each router has small tables with 255
entries to translate the one byte department number into
selection of a destination Ethernet connected to one of the
routers. Messages to the PC Lube and Tune server (130.132.59.234)
are sent through the national and New England regional networks
based on the 130.132 part of the number. Arriving at Yale, the 59
department ID selects an Ethernet connector in the C&IS
building. The 234 selects a particular workstation on that LAN.
The Yale network must be updated as new Ethernets and departments
are added, but it is not affected by changes outside the
university or the movement of machines within the department.
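
The lookup performed by an enterprise router can be pictured with a small Python sketch. The table contents and the names DEPARTMENT_TABLE and select_ethernet are assumptions made for illustration; only the 130.132 network number and the department 59 entry come from the example above.

```python
# Hypothetical router table: one byte department number -> Ethernet segment.
DEPARTMENT_TABLE = {
    59: "C&IS building Ethernet",
}


def select_ethernet(address: str, table=DEPARTMENT_TABLE):
    """Return (Ethernet segment, workstation ID) for an address inside 130.132."""
    b1, b2, department, workstation = (int(p) for p in address.split("."))
    if (b1, b2) != (130, 132):
        raise ValueError("not on this enterprise network")
    if department not in table:
        raise LookupError(f"no Ethernet registered for department {department}")
    # The third byte picks the LAN; the fourth byte picks the machine on it.
    return table[department], workstation


if __name__ == "__main__":
    print(select_ethernet("130.132.59.234"))
```

Adding a department means adding one entry to the table. Moving a machine within a department changes nothing here, which is the point made in the paragraph above.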

An Uncertain Path

Every time a message arrives at an IP router, it makes an
individual decision about where to send it next. There is no concept
of a session with a preselected path for all traffic. Consider a
company with facilities in New York, Los Angeles, Chicago and
Atlanta. It could build a network from four phone lines forming a
loop (NY to Chicago to LA to Atlanta to NY). A message arriving
at the NY router could go to LA via either Chicago or Atlanta.
The reply could come back the other way.

How does the router make a decision between routes? There is
no correct answer. Traffic could be routed by the
"clockwise" algorithm (go NY to Atlanta, LA to
Chicago). The routers could alternate, sending one message to
Atlanta and the next to Chicago. More sophisticated routing
measures traffic patterns and sends data through the least busy
link.
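
The "alternate" policy is the easiest of these to sketch. The following Python fragment is only an illustration of the idea, not how any particular router is implemented; the name alternating_router is invented for the example.

```python
import itertools


def alternating_router(next_hops):
    """Return a function that picks the next hop for each message in rotation."""
    rotation = itertools.cycle(next_hops)

    def choose(_message):
        # The message content is ignored: each message is routed on its own,
        # with no memory of a session or a preselected path.
        return next(rotation)

    return choose


if __name__ == "__main__":
    choose = alternating_router(["Atlanta", "Chicago"])
    for message in ["m1", "m2", "m3", "m4"]:
        print(message, "->", choose(message))
```

Because consecutive messages leave NY on different lines, they may arrive in LA out of order, which is one reason the sequencing done by TCP is needed.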

If one phone line in this network breaks down, traffic can
still reach its destination through a roundabout path. After
losing the NY to Chicago line, data can be sent NY to Atlanta to
LA to Chicago. This provides continued service though with
degraded performance. This kind of recovery is the primary design
feature of IP. The loss of the line is immediately detected by
the routers in NY and Chicago, but somehow this information must
be sent to the other nodes. Otherwise, LA could continue to send
NY messages through Chicago, where they arrive at a "dead
end." Each network adopts some Router Protocol which
periodically updates the routing tables throughout the network
with information about changes in route status.
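
The effect of losing a line in the four city loop can be sketched in Python. This is not a Router Protocol; it simply recomputes the path with the fewest hops from a list of working links, assuming every node already knows which lines are up. The name shortest_path is chosen for this example.

```python
from collections import deque


def shortest_path(links, start, goal):
    """Breadth-first search for a path with the fewest hops, or None."""
    neighbors = {}
    for a, b in links:                      # every phone line works both ways
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


LOOP = [("NY", "Chicago"), ("Chicago", "LA"), ("LA", "Atlanta"), ("Atlanta", "NY")]

if __name__ == "__main__":
    print(shortest_path(LOOP, "NY", "Chicago"))
    # Drop the NY to Chicago line and route again.
    damaged = [link for link in LOOP if link != ("NY", "Chicago")]
    print(shortest_path(damaged, "NY", "Chicago"))
```

With all four lines working, NY reaches Chicago directly. With the NY to Chicago line removed, the only remaining path is the roundabout one through Atlanta and LA.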

If the size of the network grows, then the complexity of the
routing updates will increase as will the cost of transmitting
them. Building a single network that covers the entire US would
be unreasonably complicated. Fortunately, the Internet is
designed as a Network of Networks. This means that loops and
redundancy are built into each regional carrier. The regional
network handles its own problems and reroutes messages
internally. Its Router Protocol updates the tables in its own
routers, but no routing updates need to propagate from a regional
carrier to the NSF spine or to the other regions (unless, of
course, a subscriber switches permanently from one region to
another).

Undiagnosed Problems

IBM designs its SNA networks to be centrally managed. If any
error occurs, it is reported to the network authorities. By
design, any error is a problem that should be corrected or
repaired. IP networks, however, were designed to be robust. In
battlefield conditions, the loss of a node or line is a normal
circumstance. Casualties can be sorted out later on, but the
network must stay up. So IP networks are robust. They
automatically (and silently) reconfigure themselves when
something goes wrong. If there is enough redundancy built into
the system, then communication is maintained.

In 1975 when SNA was designed, such redundancy would be
prohibitively expensive, or it might have been argued that only
the Defense Department could afford it. Today, however, simple
routers cost no more than a PC. Still, the TCP/IP design
principle that "errors are normal and can be largely ignored"
produces problems of its own.

Data traffic is frequently organized around "hubs,"
much like airline traffic. One could imagine an IP router in
Atlanta routing messages for smaller cities throughout the
Southeast. The problem is that data arrives without a
reservation. Airline companies experience the problem around
major events, like the Super Bowl. Just before the game, everyone
wants to fly into the city. After the game, everyone wants to fly
out. Imbalance occurs on the network when something new gets
advertised. Adam Curry announced the server at
"mtv.com" and his regional carrier was swamped with
traffic the next day. The problem is that messages come in from
the entire world over high speed lines, but they go out to
mtv.com over what was then a slow speed phone line.

Occasionally a snow storm cancels flights and airports fill up
with stranded passengers. Many go off to hotels in town. When
data arrives at a congested router, there is no place to send the
overflow. Excess packets are simply discarded. It becomes the
responsibility of the sender to retry the data a few seconds
later and to persist until it finally gets through. This recovery
is provided by the TCP component of the Internet protocol.

TCP was designed to recover from node or line failures where
the network propagates routing table changes to all router nodes.
Since the update takes some time, TCP is slow to initiate
recovery. The TCP algorithms are not tuned to optimally handle
packet loss due to traffic congestion. Instead, the traditional
Internet response to traffic problems has been to increase the
speed of lines and equipment in order to stay ahead of growth in
demand.

TCP treats the data as a stream of bytes. It logically assigns
a sequence number to each byte. The TCP packet has a header that
says, in effect, "This packet starts with byte 379642 and
contains 200 bytes of data." The receiver can detect missing
or incorrectly sequenced packets. TCP acknowledges data that has
been received and retransmits data that has been lost. The TCP
design means that error recovery is done end-to-end between the
Client and Server machine. There is no formal standard for
tracking problems in the middle of the network, though each
network has adopted some ad hoc tools.
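
The bookkeeping a receiver does with byte sequence numbers can be sketched in Python. This is a simplified illustration, not the TCP algorithm itself: it ignores the wraparound of real sequence numbers, window sizes, and timers. The name receiver_state and the (start byte, length) representation of a packet are assumptions made for this example.

```python
def receiver_state(segments, first_byte):
    """Given received (start_byte, length) pairs, return (ack, gaps).

    ack  is the next byte the receiver still needs in order.
    gaps is a list of (from_byte, to_byte) ranges that have not arrived.
    """
    gaps = []
    next_byte = first_byte
    for start, length in sorted(segments):     # put packets back in order
        if start > next_byte:
            gaps.append((next_byte, start))    # bytes skipped over: lost or late
        next_byte = max(next_byte, start + length)
    ack = gaps[0][0] if gaps else next_byte
    return ack, gaps


if __name__ == "__main__":
    # The packet "starting with byte 379642 and containing 200 bytes" arrives,
    # the following 200 bytes are lost, and the packet after that arrives.
    print(receiver_state([(380042, 200), (379642, 200)], 379642))
```

Once the sender retransmits the missing 200 bytes starting at 379842, the gap list becomes empty and the receiver can acknowledge everything up to byte 380242.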

Need to Know

There are three levels of TCP/IP knowledge. Those who
administer a regional or national network must design a system of
long distance phone lines, dedicated routing devices, and very
large configuration files. They must know the IP numbers and
physical locations of thousands of subscriber networks. They must
also have a formal network monitoring strategy to detect problems
and respond quickly.

Each large company or university that subscribes to the
Internet must have an intermediate level of network organization
and expertise. A half dozen routers might be configured to
connect several dozen departmental LANs in several buildings. All
traffic outside the organization would typically be routed to a
single connection to a regional network provider.

However, the end user can install TCP/IP on a personal
computer without any knowledge of either the corporate or
regional network. Three pieces of information are required:

  1. The IP address assigned to this personal computer
  2. The part of the IP address (the subnet mask) that
    distinguishes other machines on the same LAN (messages
    can be sent to them directly) from machines in other
    departments or elsewhere in the world (which are sent to
    a router machine)
  3. The IP address of the router machine that connects this
    LAN to the rest of the world.

In the case of the PCLT server, the IP address is
130.132.59.234. Since the first three bytes designate this
department, a "subnet mask" is defined as 255.255.255.0
(255 is the largest byte value and represents the number with all
bits turned on). It is a Yale convention (which we recommend to
everyone) that the router for each department have station number
1 within the department network. Thus the PCLT router is
130.132.59.1, and the PCLT server is configured with the values:

  • My IP address: 130.132.59.234
  • Subnet mask: 255.255.255.0
  • Default router: 130.132.59.1

The subnet mask tells the server that any other machine with
an IP address beginning 130.132.59.* is on the same department
LAN, so messages are sent to it directly. Any IP address
beginning with a different value is accessed indirectly by
sending the message through the router at 130.132.59.1 (which is
on the departmental LAN).
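
The decision the server makes with its three configured values can be sketched in Python using the standard ipaddress module. The function name next_hop is hypothetical; the default values are the PCLT settings listed above.

```python
import ipaddress


def next_hop(destination,
             my_address="130.132.59.234",
             subnet_mask="255.255.255.0",
             default_router="130.132.59.1"):
    """Return the address a message is handed to first."""
    dest = int(ipaddress.IPv4Address(destination))
    mine = int(ipaddress.IPv4Address(my_address))
    mask = int(ipaddress.IPv4Address(subnet_mask))
    # Bits turned on in the mask must match for the machine to be on this LAN.
    if dest & mask == mine & mask:
        return destination          # same department LAN: send directly
    return default_router           # anywhere else: send through the router


if __name__ == "__main__":
    print(next_hop("130.132.59.17"))    # another machine in the department
    print(next_hop("130.132.21.5"))     # another Yale department
    print(next_hop("192.35.91.7"))      # outside Yale
```

Only the first destination shares the 130.132.59 prefix, so only it is reached directly; the other two are handed to the router at 130.132.59.1.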

Additional information is available in self-study courses from
SRA (1-800-SRA-1277)

  • TCP/IP [34610]

Copyright 1995 PCLT -- Introduction
to TCP/IP -- H. Gilbert
