
The Internet

The Internet is a network of networks. Their interconnection is organized as a hierarchy of domains, subdomains, and so on, through interfaces. An interface is the hardware in a computer that allows it to be connected (typically, an Ethernet card). Some computers may have several interfaces. Each interface has a unique IP address that respects, in general, the interconnection hierarchy. Message routing is also organized hierarchically: from domain to domain; then from domain to subdomains, and so on, until a message reaches its destination interface. Besides their interface addresses, computers usually also have a name, as do domains and subdomains. Some machines have a particular role in the network:
some machines connect one network to another;
others use their knowledge of the topology of the Internet to route data;
name servers track the correspondence between machine names and network addresses.
The purpose of the Internet Protocol (IP) is to make the network of networks into a single entity. This is why one can speak of the Internet. Any two machines connected via the Internet can communicate. Many kinds of machines and systems coexist on the Internet. All of them use IP, and most of them also use the UDP and TCP layers.

The different protocols and services used by the Internet are described in RFCs (Requests for Comments), which can be found on the Jussieu mirror site:


Internet Protocols and Services

The unit of transfer used by the IP protocol is the datagram or packet. This protocol is unreliable: it does not assure proper order, safe arrival, or non-duplication of transmitted packets. It only deals with correct routing of packets and signaling of errors when a packet is unable to reach its destination. Addresses are coded into 32 bits in the current version of the protocol: IPv4. These 32 bits are divided into four fields, each containing values between 0 and 255. IP addresses are written with the four fields separated by periods (dotted-decimal notation).
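To make the 32-bit layout concrete, here is a minimal sketch that packs four fields into a single integer and unpacks it again. The function names int_of_dotted and dotted_of_int are hypothetical, the address 192.0.2.1 is a documentation-only example rather than one taken from the text, and the code assumes a 64-bit OCaml where a native int can hold 32 bits.

```ocaml
(* Sketch only: pack "a.b.c.d" into one integer and back.
   Assumes a 64-bit OCaml (native int wider than 32 bits).
   The names below are illustrative, not part of the Unix library. *)

(* Returns None unless the string is four integer fields in 0..255. *)
let int_of_dotted (s : string) : int option =
  match String.split_on_char '.' s with
  | [a; b; c; d] ->
      (try
         let fields = List.map int_of_string [a; b; c; d] in
         if List.for_all (fun x -> x >= 0 && x <= 255) fields
         then
           (* Each field occupies 8 bits, most significant first. *)
           Some (List.fold_left (fun acc x -> (acc lsl 8) lor x) 0 fields)
         else None
       with Failure _ -> None)
  | _ -> None

(* Inverse operation: extract the four 8-bit fields. *)
let dotted_of_int (n : int) : string =
  Printf.sprintf "%d.%d.%d.%d"
    ((n lsr 24) land 0xFF) ((n lsr 16) land 0xFF)
    ((n lsr 8) land 0xFF) (n land 0xFF)
```

In real programs the abstract type Unix.inet_addr plays this role; the integer form here only illustrates how four fields fit into 32 bits.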

The IP protocol is in the midst of an important change made necessary by the exhaustion of address space and the growing complexity of routing problems due to the expansion of the Internet. The new version of the IP protocol is IPv6, which is described in [Hui97].

Above IP, two protocols allow higher-level transmissions: UDP (User Datagram Protocol) and TCP (Transmission Control Protocol). These two protocols use IP for communication between machines, and additionally allow communication between applications (or programs) running on those machines. They deal with correct transmission of information, independent of contents. The identification of applications on a machine is done via a port number.

UDP is a connectionless, unreliable protocol: it is to applications as IP is to interfaces. TCP is a connection-oriented, reliable protocol: it manages acknowledgement, retransmission, and ordering of packets. Further, it is capable of optimizing transmission by a windowing technique.
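The difference between the two protocols can be illustrated with a toy model of what a reliable layer adds on top of unordered, possibly duplicated datagrams. This is not TCP itself: the segment type and the reassemble function are hypothetical, and acknowledgement, retransmission, and windowing are left out.

```ocaml
(* Toy model, not TCP: each segment carries a sequence number so the
   receiver can drop duplicates and deliver payloads in order. *)
type segment = { seq : int; payload : string }

(* Deliver the longest gap-free run of payloads starting at [first].
   Segments may arrive in any order and may be duplicated. *)
let reassemble ~(first : int) (received : segment list) : string list =
  let sorted =
    received
    |> List.filter (fun s -> s.seq >= first)
    |> List.sort_uniq (fun a b -> compare a.seq b.seq)
  in
  let rec deliver expected = function
    | s :: rest when s.seq = expected ->
        s.payload :: deliver (expected + 1) rest
    | _ -> []  (* gap or end: a real protocol would wait for retransmission *)
  in
  deliver first sorted
```

With UDP, the application would see the segments exactly as they arrive; a connection-oriented layer hides the reordering and duplication behind an ordered stream.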

The standard services (applications) of the Internet most often use the client-server model. The server manages requests by clients, offering them a specific service. There is an asymmetry between client and server. The services establish high-level protocols for keeping track of transmitted contents. Among the standard services, we note:

Other services use the client-server model:

Communication between applications takes place via sockets. Sockets allow communication between processes residing on possibly different machines. Different processes can read and write to sockets.

The Unix Module and IP Addressing

The Unix library defines the abstract type inet_addr representing Internet addresses, as well as two conversion functions between an internal representation of addresses and strings:

# Unix.inet_addr_of_string ;;
- : string -> Unix.inet_addr = <fun>
# Unix.string_of_inet_addr ;;
- : Unix.inet_addr -> string = <fun>
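As a small illustration of these two conversions, the sketch below converts a string to an inet_addr and back. The helper name roundtrip is hypothetical; the code assumes the unix library is linked, and relies on inet_addr_of_string raising Failure on a malformed string.

```ocaml
(* Sketch: convert a string to Unix.inet_addr and back.
   Requires the unix library to be linked or loaded. *)
let roundtrip (s : string) : string option =
  try Some (Unix.string_of_inet_addr (Unix.inet_addr_of_string s))
  with Failure _ -> None  (* the string is not a valid address *)

(* The predefined loopback address, printed in dotted notation. *)
let loopback : string = Unix.string_of_inet_addr Unix.inet_addr_loopback
```

Wrapping the conversion in an option keeps a malformed address from propagating as an exception through the rest of the program.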

In applications, Internet addresses and port numbers for services (or service numbers) are often replaced by names. The correspondence between names and addresses or numbers is managed using databases. The Unix library provides functions to request data from these databases and provides datatypes to allow storage of the obtained information. We briefly describe these functions below.

Address tables.
The table of addresses (hosts database) contains the association between machine name(s) and interface address(es). The structure of entries in the address table is represented by:

# type host_entry =
{ h_name : string;
h_aliases : string array;
h_addrtype : socket_domain;
h_addr_list : inet_addr array } ;;
The first two fields contain the machine name and its aliases; the third contains the address type; the last contains the array of machine addresses.

A machine name is obtained by using the function:

# Unix.gethostname ;;
- : unit -> string = <fun>
# let my_name = Unix.gethostname() ;;
val my_name : string = ""

The functions that query the address table take as argument either the machine name or one of its addresses.

# Unix.gethostbyname ;;
- : string -> Unix.host_entry = <fun>
# Unix.gethostbyaddr ;;
- : Unix.inet_addr -> Unix.host_entry = <fun>
# let my_entry_byname = Unix.gethostbyname my_name ;;
val my_entry_byname : Unix.host_entry =
{Unix.h_name=""; Unix.h_aliases=[|"estephe"|];
Unix.h_addrtype=Unix.PF_INET; Unix.h_addr_list=[|<abstr>|]}
# let my_addr = my_entry_byname.Unix.h_addr_list.(0) ;;
val my_addr : Unix.inet_addr = <abstr>

# let my_entry_byaddr = Unix.gethostbyaddr my_addr ;;
val my_entry_byaddr : Unix.host_entry =
{Unix.h_name=""; Unix.h_aliases=[|"estephe"|];
Unix.h_addrtype=Unix.PF_INET; Unix.h_addr_list=[|<abstr>|]}

# let my_full_name = my_entry_byaddr.Unix.h_name ;;
val my_full_name : string = ""
These functions raise the Not_found exception in case the request fails.
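Since a failed lookup raises Not_found, a program will usually want to catch it. The following is a minimal sketch, assuming the unix library is linked; the helper name addresses_of_host is hypothetical, and the result depends on the local hosts database and resolver.

```ocaml
(* Sketch: list the addresses of a host as strings, returning the
   empty list when the name is unknown. Requires the unix library. *)
let addresses_of_host (name : string) : string list =
  try
    let entry = Unix.gethostbyname name in
    Array.to_list
      (Array.map Unix.string_of_inet_addr entry.Unix.h_addr_list)
  with Not_found -> []

(* Example use: print every address found for the local machine. *)
let print_my_addresses () =
  List.iter print_endline (addresses_of_host (Unix.gethostname ()))
```

Returning an empty list is one possible design; a caller that must distinguish "unknown host" from "host with no address" could return an option instead.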

Table of services.
The table of services contains the correspondence between service names and port numbers. The majority of Internet services are standardized. The structure of entries in the table of services is:

# type service_entry =
{ s_name : string;
s_aliases : string array;
s_port : int;
s_proto : string } ;;
The first two fields are the service name and any aliases it may have; the third field contains the port number; the last field contains the name of the protocol used.

A service is in fact characterized by its port number and the underlying protocol. The query functions are:

# Unix.getservbyname ;;
- : string -> string -> Unix.service_entry = <fun>
# Unix.getservbyport ;;
- : int -> string -> Unix.service_entry = <fun>
# Unix.getservbyport 80 "tcp" ;;
- : Unix.service_entry =
{Unix.s_name="www"; Unix.s_aliases=[|"http"|]; Unix.s_port=80;
 Unix.s_proto="tcp"}
# Unix.getservbyname "ftp" "tcp" ;;
- : Unix.service_entry =
{Unix.s_name="ftp"; Unix.s_aliases=[||]; Unix.s_port=21; Unix.s_proto="tcp"}
These functions raise the Not_found exception if they cannot find the service requested.
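The same pattern applies to the table of services. The sketch below, again assuming the unix library is linked, wraps getservbyname so that an unknown service yields None; the helper name port_of_service and its optional proto argument are hypothetical, and the answers depend on the local services database.

```ocaml
(* Sketch: look up the port of a named service, defaulting to TCP.
   Requires the unix library. *)
let port_of_service ?(proto = "tcp") (name : string) : int option =
  try Some (Unix.getservbyname name proto).Unix.s_port
  with Not_found -> None  (* no such service for this protocol *)

(* Example use: report the port, if any, on standard output. *)
let describe (name : string) : unit =
  match port_of_service name with
  | Some port -> Printf.printf "%s/tcp uses port %d\n" name port
  | None -> Printf.printf "%s/tcp is not in the table of services\n" name
```

Because a service is characterized by both its port and its protocol, the protocol stays an explicit (if optional) parameter rather than being hidden inside the helper.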
