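Socket programming is introduced briefly in 《Linux从入门到精通》. For more
detail, see the Tsinghua reprint of "UNIX Network Programming, Volume 1:
Networking APIs: Sockets and XTI, 2nd Edition", which also covers thread
programming. It is an excellent reference; unfortunately no one has translated it.
Hu Shuyu translated an article on network programming, which I have included here.
For updates, please see Mr. Hu's personal home page.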
________________________________________________________________________________
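| Copyright Notice |
| |
| 1. This article may be reproduced, modified, and quoted, but please keep this |
|    notice and the English original appended after it. |
| 2. Under no circumstances may the names of the authors and translator be removed. |
| 3. Under no circumstances may this article be put to commercial use. |
| |
| Hu Shuyu (胡淑瑜) |
| (husuyu@linux.cqi.com.cn) |
| |
| November 1998 |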
|______________________________________________________________________________|
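Chapter 59
Contents
Network Programming
Ports and Sockets
Socket Programming
The socket() System Call
The bind() System Call
The listen() System Call
The accept() System Call
The setsockopt() and getsockopt() System Calls
The connect() System Call
Listing 59.1. The server side for a socket-oriented protocol.
Listing 59.2. The client side function.
Connectionless Socket Programming
Listing 59.3. The server side.
Note
Record and File Locking
Interprocess Communications
Summary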
-------------------------------------------------------------------------------
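-- Chapter 59 --
Network Programming
by Kamran Husain and Tim Parker
translated by Hu Shuyu
In This Chapter
Ports and sockets
Socket programming
Record and file locking
Interprocess communications
This chapter looks at the basic concepts you need for network programming:
Ports and sockets
Record and file locking
Interprocess communications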
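It is impossible to explain everything about network programming in just a few
pages. Indeed, the best available reference on the subject takes almost 800
pages in its first volume alone. If you really want to do network programming,
you need a lot of experience with compilers, TCP/IP, and network operating
systems, and a great deal of patience.
For details on TCP/IP, see Teach Yourself TCP/IP in 14 Days, by Tim Parker
(Sams Publishing).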
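Ports and Sockets
Network programming relies on sockets to accept and transmit information.
Although the word "socket" may sound mysterious, the concept is easy to
understand.
Most network applications use one of two transport protocols: Transmission
Control Protocol (TCP) or User Datagram Protocol (UDP). Both use a port number
to identify the application. Each application the machine is handling has its
own port number, so applications can be tracked by number rather than by name.
Port numbers make it easier for the operating system to know how many
applications are using the system and which services are available.
In theory, the system administrator of each machine could assign port numbers
freely, but conventions have been adopted to allow better communication. These
conventions let the port number identify the type of service that one system is
requesting from another. For this reason, most systems maintain a file of port
numbers and their corresponding services (/etc/services on Linux).
Port numbers are 16-bit values ranging from 1 to 65535. By convention, ports 1
to 1023 are the well-known ports, assigned to standard network services
requested by remote applications, and by default on Linux only privileged
processes can bind to them. Higher port numbers are available to ordinary
programs on the local machine.
Each network communications circuit into and out of the host computer's TCP
layer is uniquely identified by a combination of two numbers, together called a
socket: the IP address of the machine and the port number used by the TCP
software.
Because at least two machines are involved in network communications, there is
a socket on both the sending and the receiving machine. Because each machine's
IP address is unique and port numbers are unique within each machine, a socket
is unique across the network. This setup lets two applications talk to each
other across the network based entirely on their sockets.
The sending and receiving machines each maintain a port table that lists all
active port numbers. For each session, the two machines hold mirrored entries,
a process called binding. In other words, if one machine has source port 23 and
destination port 25, the other machine has source port 25 and destination
port 23.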
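Socket Programming
Linux supports BSD-style socket programming, with both connection-oriented and
connectionless sockets. In connection-oriented communication, the server and
client establish a connection before any data is exchanged. In connectionless
communication, data is exchanged as self-contained messages. In either case, the
server starts first, binds itself to a socket, and waits for messages. How the
server listens depends on the type of connection you have programmed.
You need to know about a few system calls:
socket()
bind()
listen()
accept()
setsockopt() and getsockopt()
connect()
sendto()
recvfrom()
We will use these system calls in the following examples.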
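The socket() System Call
The socket() system call creates a socket for the client or the server. It is
defined as follows:
#include<sys/types.h>
#include<sys/socket.h>
int socket(int family, int type, int protocol);
For TCP/IP networking, family is AF_INET; AF_UNIX is for sockets that connect
processes on the same machine. The type is either SOCK_STREAM for reliable,
though slower, communication or SOCK_DGRAM for faster, but less reliable,
communication. With AF_INET, the protocol is IPPROTO_TCP for SOCK_STREAM and
IPPROTO_UDP for SOCK_DGRAM; passing 0 selects these defaults.
The function returns -1 on error; otherwise it returns a socket descriptor,
which you use to refer to this socket in all subsequent calls.
Sockets are created without a name. Clients use the name (address) of a socket
to reach it, so the server must give its socket a name. This is what the bind
function does.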
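The bind() System Call
The bind() system call assigns a name to an unnamed socket. It is defined like
this:
#include<sys/types.h>
#include<sys/socket.h>
int bind(int sockfd, const struct sockaddr *saddr, socklen_t addrlen);
The first argument is a socket descriptor, the second is a structure holding
the name (address) to use, and the third is the size of that structure.
Once a server has bound an address to its socket, it can listen on it, and
clients can connect() to that address. If your program is a server, it sets
itself up to listen and then waits for connections. Let's look at the functions
that do this.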
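The listen() System Call
The listen() system call is used by the server. It is defined in the following
way:
#include<sys/types.h>
#include<sys/socket.h>
int listen(int sockfd, int backlog);
sockfd is the socket descriptor. backlog is the number of connections that may
be pending at one time before further ones are rejected; 5 is the traditional
value. The call returns 0 on success and -1 on error.
If this call succeeds, you can accept connections.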
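The accept() System Call
The accept() system call is used by a server to accept incoming connections
made by clients' connect() calls. Be aware that, by default, this function
blocks: it does not return until a connection arrives. It is defined like this:
#include<sys/types.h>
#include<sys/socket.h>
int accept(int sockfd, struct sockaddr *peeraddr, socklen_t *addrlen);
The parameters resemble those of bind(), except that peeraddr points to a
structure that receives information about the client making the connection
request, and addrlen is a pointer: set it to the size of that structure before
the call, and accept() stores the actual address length there. On success,
accept() returns a new socket descriptor for the connection.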
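The setsockopt() and getsockopt() System Calls
A server restarted on Linux can hit a surprising problem: it cannot bind to its
port again, even though the previous instance closed the socket properly. For
example, say you write a server that waits on a socket. The server opens the
socket and listens on it with no problems. When the program is restarted for
some reason (a crash or normal termination), however, bind() fails with an
error (EADDRINUSE) indicating that the port is already in use.
The cause is TCP's TIME_WAIT state: for a short period after a connection
closes, the kernel keeps the old address and port reserved so that stray
packets from the old connection cannot be mistaken for new traffic. This is
standard TCP behavior rather than a Linux-specific defect.
The way around the problem is to use setsockopt() to set the SO_REUSEADDR
option on the socket after it is opened and before it is bound. setsockopt()
sets options and getsockopt() gets options for a given socket.
Here is the syntax for these calls:
#include<sys/types.h>
#include<sys/socket.h>
int getsockopt(int sockfd, int level, int name, void *value, socklen_t *optlen);
int setsockopt(int sockfd, int level, int name, const void *value, socklen_t optlen);
sockfd must be an open socket. level is the protocol level of the option
(IPPROTO_TCP for TCP options, SOL_SOCKET for socket-level options), and the
option names are defined in the man pages. value points to the location where
getsockopt() stores the option value, or from which setsockopt() reads it. For
setsockopt(), optlen is the length of that value in bytes. For getsockopt(),
optlen is a pointer to an integer that holds the size of the buffer on entry
and the length of the returned value on exit.
Full details of the options are in the setsockopt(2) man page.
Returning to the port-reuse problem: after opening the server socket and before
calling bind(), call setsockopt() with the following code:
#ifdef LINUX
int opt = 1;
setsockopt(sockfd,SOL_SOCKET,SO_REUSEADDR,&opt,sizeof(opt));
#endif
The #ifdef and #endif lines are needed only if you want to port the code to
systems other than Linux that might not support or require the SO_REUSEADDR
flag.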
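The connect() System Call
The connect() system call is used by clients to connect to a server in a
connection-oriented system. A client does not need to call bind() first; if it
has not, the kernel chooses a local address and port during connect(). The call
is defined like this:
#include<sys/types.h>
#include<sys/socket.h>
int connect(int sockfd, const struct sockaddr *servsaddr, socklen_t addrlen);
The parameters are the same as those for bind(), except that servsaddr points
to information about the server that the client is connecting to. On the
server, accept() creates a new socket for each request. This way, the server
can fork() off a new process to handle it and go back to waiting for more
connections. Server-side code looks like Listing 59.1.
Listing 59.1. The server side for a socket-oriented protocol.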
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h> /* struct sockaddr_in, htonl(), htons() */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define MY_PORT 6545
/*
 * myabort() and do_your_thing() are not defined in the original listing;
 * these minimal stand-ins are assumptions that make it compile.
 */
static void myabort(const char *msg) { perror(msg); exit(1); }
static void do_your_thing(int fd) { (void)fd; /* talk to the client here */ }
int main(void)
{
    int sockfd, newfd;
    int cpid; /* child id */
    socklen_t len;
    struct sockaddr_in servaddr;
    struct sockaddr_in clientInfo;
    if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
    {
        myabort("Unable to create socket");
    }
#ifdef LINUX
    {
        int opt = 1;
        setsockopt(sockfd,SOL_SOCKET,SO_REUSEADDR,&opt,sizeof(opt));
    }
#endif
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(MY_PORT);
    /*
     * The htonl (for a long integer) and htons (for short integer) convert
     * a host oriented byte order into a network order.
     */
    if (bind(sockfd,(struct sockaddr *)&servaddr,sizeof(servaddr)) < 0)
    {
        myabort("Unable to bind socket");
    }
    if (listen(sockfd,5) < 0)
    {
        myabort("Unable to listen on socket");
    }
    signal(SIGCHLD, SIG_IGN); /* let the kernel reap finished children */
    for (;;)
    {
        /* wait here */
        len = sizeof(clientInfo);
        newfd=accept(sockfd,(struct sockaddr *)&clientInfo,&len);
        if (newfd < 0)
        {
            myabort("Unable to accept on socket");
        }
        if ((cpid = fork()) < 0)
        {
            myabort("Unable to fork on accept");
        }
        else if (cpid == 0) { /* child */
            close(sockfd); /* no need for original */
            do_your_thing(newfd);
            exit(0);
        }
        close(newfd); /* in the parent */
    }
}
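In the case of connection-oriented protocols, the server performs the following
steps:
Creates a socket with a call to the socket() function
Binds itself to an address with the bind() function
Listens for connections with the listen() function
Accepts incoming requests with the accept() function
Gets incoming messages with the read() function and replies with write()
Now let's look at the client side, in Listing 59.2.
Listing 59.2. The client side function.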
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h> /* struct sockaddr_in, htons() */
#include <arpa/inet.h> /* inet_addr() */
#include <string.h>
#include <unistd.h>
#define MY_PORT 6545
#define MY_HOST_ADDR "204.25.13.1"
int getServerSocketId(void)
{
    int fd;
    struct sockaddr_in serv_addr;
    /* create an Internet domain stream (TCP) socket */
    if ( (fd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
    {
        return(-1);
    }
    /* fill socket address structure with the server's address */
    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    /* convert internet address to binary value */
    serv_addr.sin_addr.s_addr = inet_addr(MY_HOST_ADDR);
    serv_addr.sin_port = htons(MY_PORT);
    /*
     * No bind() here: the kernel picks the client's local address and
     * port during connect(), so the original -2 return is no longer used.
     */
    if (connect(fd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0)
    {
        close(fd);
        return(-3);
    }
    return(fd);
}
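A client for connection-oriented communication takes the following steps:
Creates a socket with a call to the socket() function
Attempts to connect to the server with a connect() call
If a connection is made, requests data with write() and reads incoming replies
with read()
Connectionless Socket Programming
Now let's consider a connectionless exchange of information. The server side
differs from the connection-oriented case in that the server calls recvfrom()
rather than listen() and accept(). To reply to messages, the server uses
sendto(). See Listing 59.3 for the server side.
Listing 59.3. The server side.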
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h> /* struct sockaddr_in, htonl(), htons() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MY_PORT 6545
#define MAXM 4096
char mesg[MAXM];
/*
 * myabort() and doSomethingToIt() are not defined in the original listing;
 * these minimal stand-ins are assumptions that make it compile.
 */
static void myabort(const char *msg) { perror(msg); exit(1); }
static void doSomethingToIt(char *m) { (void)m; /* process the message here */ }
int main(void)
{
    int sockfd;
    ssize_t n;
    socklen_t len;
    struct sockaddr_in servaddr;
    struct sockaddr_in clientInfo;
    /* SOCK_DGRAM selects a connectionless (UDP) socket */
    if ((sockfd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
    {
        myabort("Unable to create socket");
    }
#ifdef LINUX
    {
        int opt = 1;
        setsockopt(sockfd,SOL_SOCKET,SO_REUSEADDR,&opt,sizeof(opt));
    }
#endif
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(MY_PORT);
    /*
     * The htonl (for a long integer) and htons (for short integer) convert
     * a host oriented byte order into a network order.
     */
    if (bind(sockfd,(struct sockaddr *)&servaddr,sizeof(servaddr)) < 0)
    {
        myabort("Unable to bind socket");
    }
    for (;;)
    {
        /* wait here; len must hold the buffer size before every call */
        len = sizeof(clientInfo);
        n = recvfrom(sockfd, mesg, MAXM, 0,
                     (struct sockaddr *)&clientInfo, &len);
        if (n < 0)
        {
            continue; /* skip failed receives */
        }
        doSomethingToIt(mesg);
        sendto(sockfd, mesg, (size_t)n, 0,
               (struct sockaddr *)&clientInfo, len);
    }
}
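As you can see, only two function calls are needed to process each message,
which makes this easier to implement than a connection-oriented server. You
must, however, process messages one at a time, because messages from multiple
clients arrive interleaved on the same socket. In a connection-oriented scheme,
the child process always knows where each message originated.
The client does not have to call connect() either; it can call sendto()
directly. The client side is much like the server side, except that it calls
sendto() before recvfrom():
#include <sys/types.h>
#include <sys/socket.h>
ssize_t sendto(int sockfd,
const void *message, /* the pointer to message */
size_t length, /* of message */
int flags, /* usually 0 */
const struct sockaddr *dest, /* where to send it */
socklen_t addrlen); /* of sockaddr */
NOTE: If you are a BSD user, use the sendto() call, not the sendmsg() call.
The sendto() call is more efficient.
Any error is indicated by a return value of -1. Only local errors are detected.
The recvfrom() system call is defined as shown here:
#include <sys/types.h>
#include <sys/socket.h>
ssize_t recvfrom(int sockfd,
void *message, /* buffer that receives the message */
size_t length, /* of buffer */
int flags, /* usually 0 */
struct sockaddr *from, /* filled in with the sender's address */
socklen_t *addrlen); /* in: size of from; out: length of address */
If a message is too long to fit in the supplied buffer, the extra bytes are
discarded. The call might return immediately or wait forever, depending on the
flags that are set. You can even set timeout values. Check the man page for
recvfrom for more information.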
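There you have it: the very basics of how to program applications that take
advantage of the networking capabilities of Linux. We have not even scratched
the surface of the intricacies of network programming. A good starting point
for more detail is UNIX Network Programming, by W. Richard Stevens (Prentice
Hall, 1990). This classic is used as a textbook in many universities and is
extremely detailed.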
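Record and File Locking
When two processes share a file, there is a danger that one process changes the
file's contents and thereby affects the other. For this reason, most operating
systems offer a form of mutual exclusion: while one process holds a file, no
other process may touch it. This is called file locking.
This technique is simple to implement. What usually happens is that a "lock
file" is created with the same name as the original file plus the extension
.lock, which tells other processes that the file is unavailable. This is how
many Linux spoolers, such as the print system and UUCP, implement file locking.
It is a brute-force method, perhaps, but effective and easy to program.
Unfortunately, this technique does not work well when several processes must
access the same information quickly, because the delays waiting for files to be
opened and closed can grow appreciable. Also, if one process doesn't release
the file properly, other processes can hang, waiting for access.
For this reason, record locking is sometimes used. With record locking, a small
part of a larger file is locked to prevent two processes from changing it at
the same time. Record locking lets many processes access the same file at once,
each updating different records, if necessary. Record locking is, of course,
more complex to program than file locking.
Normally, to implement record locking, you use a file offset: the number of
characters from the beginning of the file. In most cases, a range of characters
is locked; the program notes the start and length of the locked region and
stores that information where other processes can examine it.
Writing either file-locking or record-locking code requires a good
understanding of the operating system but is otherwise not difficult,
especially because thousands of programs are readily available on the Internet,
in network programming books, and on BBSs to examine for sample code.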
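Interprocess Communications
Network programming always involves two or more processes talking to each other
(interprocess communications), so the way processes communicate is vitally
important to network programmers. Network programming differs from ordinary
programming in a few important respects. A traditional program can talk to
different modules (or even other applications on the same machine) through
global variables and function calls. That doesn't work across networks.
A key goal of network programming is to ensure that processes don't interfere
with each other. Otherwise, systems can bog down or lock up. Therefore,
processes must have a clean and efficient method of communicating. UNIX is
particularly strong in this regard, because many basic UNIX capabilities, such
as pipes and queues, work well across networks.
Writing code for interprocess communications is quite difficult compared to
coding a single application. If you want to write this type of routine, study
sample programs from a network programming book or a BBS site to see how the
task is accomplished.
Summary
Few people need to write network applications, so the details of the process
are best left to those who want them. Experience and lots of examples are the
best way to begin writing network code, and mastering the skills can take many
years.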
-------------------------------English original--------------------------------
- 59 -
Network Programming
Ports and Sockets
Socket Programming
The socket() System Call
The bind() System Call
The listen() System Call
The accept() System Call
The setsockopt() and getsockopt() System Calls
The connect() System Call
Listing 59.1. The server side for a socket-oriented protocol.
Listing 59.2. The client side function.
Connectionless Socket Programming
Listing 59.3. The server side.
NOTE
Record and File Locking
Interprocess Communications
Summary
- 59 -
Network Programming
by Kamran Husain and Tim Parker
IN THIS CHAPTER
Ports and Sockets
Socket Programming
Record and File Locking
Interprocess Communications
This chapter looks at the basic concepts you need for network programming:
Ports and sockets
Record and file locking
Interprocess communications
It is impossible to tell you how to program applications for a network in just a
few pages. Indeed, the best available reference to network programming takes
almost 800 pages in the first volume alone! If you really want to do network
programming, you need a lot of experience with compilers, TCP/IP, and network
operating systems--and you need a great deal of patience.
For details on TCP/IP, check the book Teach Yourself TCP/IP in 14 Days, by Tim
Parker (Sams Publishing).
Ports and Sockets
Network programming relies on the use of sockets to accept and transmit
information. Although there is a lot of mystique about sockets, the concept is
actually simple to understand.
Most applications that use the two primary network protocols, Transmission
Control Protocol (TCP) and User Datagram Protocol (UDP) have a port number that
identifies the application. A port number is used for each different application
the machine is handling, so it can keep track of those applications by numbers
rather than names. The port number makes it easier for the operating system to
know how many applications are using the system and which services are
available.
In theory, port numbers can be assigned on individual machines by the system
administrator, but some conventions have been adopted to allow better
communications. These conventions enable the port number to identify the type of
service that one system is requesting from another. For this reason, most
systems maintain a file of port numbers and their corresponding services.
Port numbers are 16-bit values ranging from 1 to 65535. By convention, ports 1
to 1023 are the well-known ports, assigned to standard network services
requested by remote applications, and by default on Linux only privileged
processes can bind to them. Higher port numbers are available to ordinary
programs on the local machine.
Each network communications circuit into and out of the host computer's TCP
application layer is uniquely identified by a combination of two numbers,
together called the socket. The socket is composed of the IP address of the
machine and the port number used by the TCP software.
Because at least two machines are involved in network communications, there will
be a socket on both the sending and the receiving machine. Because the IP
address of each machine is unique and the port numbers are unique to each
machine, socket numbers are also unique across the network. This setup enables
an application to talk to another application across the network based entirely
on the socket number.
The sending and receiving machines maintain a port table that lists all active
port numbers. The two machines involved have reversed entries for each session
between the two, a process called binding. In other words, if one machine has
the source port number 23 and the destination port number set at 25, the other
machine has its source port number set at 25 and the destination port number set
at 23.
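As a small illustration of the address-plus-port pair described above, the
following sketch fills in a struct sockaddr_in, the structure the BSD socket
calls use to carry an IPv4 address and a port. The helper name make_endpoint is
hypothetical and not part of any library; inet_pton() and htons() are the
standard conversion routines.

```c
#include <arpa/inet.h>    /* inet_pton(), htons(), ntohs() */
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_LOOPBACK */
#include <string.h>       /* memset() */

/*
 * Hypothetical helper: store an IPv4 address and a port in *ep.
 * Returns 0 on success, -1 if ip is not a valid dotted-quad string.
 */
int make_endpoint(struct sockaddr_in *ep, const char *ip, unsigned short port)
{
    memset(ep, 0, sizeof(*ep));
    ep->sin_family = AF_INET;
    ep->sin_port = htons(port);          /* port in network byte order */
    if (inet_pton(AF_INET, ip, &ep->sin_addr) != 1)
        return -1;                       /* address did not parse */
    return 0;
}
```

Both fields are stored in network byte order, which is why the port goes
through htons() before it is assigned.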
Socket Programming
Linux supports BSD-style socket programming. Both connection-oriented and
connectionless types of sockets are supported. In connection-oriented
communication, the server and client establish a connection before any data is
exchanged. In connectionless communication, data is exchanged as part of a
message. In either case, the server always starts first, binds itself to a
socket, and listens to messages. How the server attempts to listen depends on
the type of connection for which you have programmed it.
You need to know about a few system calls:
socket()
bind()
listen()
accept()
setsockopt() and getsockopt()
connect()
sendto()
recvfrom()
We will cover these system calls in the following examples.
The socket() System Call
The socket() system call creates a socket for the client or the server. The
socket function is defined as shown here:
#include<sys/types.h>
#include<sys/socket.h>
int socket(int family, int type, int protocol);
For TCP/IP networking, family is AF_INET; AF_UNIX is for sockets that connect
processes on the same machine. The type is either SOCK_STREAM for reliable,
though slower, communications or SOCK_DGRAM for faster, but less reliable,
communications. With AF_INET, the protocol is IPPROTO_TCP for SOCK_STREAM and
IPPROTO_UDP for SOCK_DGRAM; passing 0 selects these defaults.
The return value from this function is -1 if there was an error; otherwise, it's
a socket descriptor. You will use this socket descriptor to refer to this socket
in all subsequent calls in your program.
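A minimal sketch of the call and its error check might look like this. The
helper names open_inet_socket and socket_type are hypothetical; the sketch
assumes IPv4 (AF_INET) and passes 0 as the protocol so that the system picks
the default for the requested type.

```c
#include <sys/types.h>
#include <sys/socket.h>   /* socket(), getsockopt(), SO_TYPE */
#include <stdio.h>        /* perror() */

/* Hypothetical helper: create an IPv4 socket of the given type, or -1. */
int open_inet_socket(int type)
{
    int fd = socket(AF_INET, type, 0);
    if (fd < 0)
        perror("socket");                /* report why creation failed */
    return fd;
}

/* Hypothetical helper: ask the kernel which type a socket has; -1 on error. */
int socket_type(int fd)
{
    int type = 0;
    socklen_t len = sizeof(type);
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}
```

The descriptor returned by open_inet_socket() is what every later call
(bind(), listen(), connect() and so on) takes as its first argument.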
Sockets are created without a name. Clients use the name of the socket to read
or write to it. This is where the bind function comes in.
The bind() System Call
The bind() system call assigns a name to an unnamed socket. The bind function is
defined like this:
#include<sys/types.h>
#include<sys/socket.h>
int bind(int sockfd, const struct sockaddr *saddr, socklen_t addrlen);
The first item is a socket descriptor. The second is a structure with the name
to use, and the third item is the size of the structure.
Now that you have bound an address for your server or client, you can connect()
to it or listen on it. If your program is a server, it sets itself up to listen
and accept connections. Let's look at the function available for such an
endeavor.
The listen() System Call
The listen() system call is used by the server. It is defined in the following
way:
#include<sys/types.h>
#include<sys/socket.h>
int listen(int sockfd, int backlog);
The sockfd is the descriptor of the socket. The backlog is the number of
connections that can be pending at one time before further ones are rejected;
5 is the traditional value. The call returns 0 on success and -1 on error.
If this call is successful, you can accept connections.
The accept() System Call
The accept() system call is used by a server to accept incoming connections
made by clients' connect() calls. Be aware that, by default, this function
blocks: it does not return until a connection arrives. It is defined like this:
#include<sys/types.h>
#include<sys/socket.h>
int accept(int sockfd, struct sockaddr *peeraddr, socklen_t *addrlen);
The parameters resemble those of the bind call, except that peeraddr points to
a structure that receives information about the client making the connection
request, and addrlen is a pointer: set it to the size of that structure before
the call, and accept() stores the actual address length there. On success,
accept() returns a new socket descriptor for the connection.
The setsockopt() and getsockopt() System Calls
A server restarted on Linux can hit a surprising problem: it cannot bind to its
port again, even though the previous instance closed the socket properly. For
example, say you write your own server that waits on a socket. This server
opens the socket and listens on it with no problems. However, when the program
is restarted for some reason (a crash or normal termination), bind() fails with
an error (EADDRINUSE) indicating that the port is already in use.
The cause is TCP's TIME_WAIT state: for a short period after a connection
closes, the kernel keeps the old address and port reserved so that stray
packets from the old connection cannot be mistaken for new traffic. This is
standard TCP behavior rather than a Linux-specific defect.
The way around the problem is to use the setsockopt() system call to set the
SO_REUSEADDR option on the socket after it is opened and before it is bound.
The setsockopt() call sets options and the getsockopt() call gets options for a
given socket.
Here is the syntax for these calls:
#include<sys/types.h>
#include<sys/socket.h>
int getsockopt(int sockfd, int level, int name, void *value, socklen_t *optlen);
int setsockopt(int sockfd, int level, int name, const void *value, socklen_t optlen);
The sockfd must be an open socket. The level is the protocol level of the
option (IPPROTO_TCP for TCP options and SOL_SOCKET for socket level options),
and the name of the option is as defined in the socket's man page. The value
pointer points to the location where getsockopt() stores the option value, or
from which setsockopt() reads it. For setsockopt(), optlen is the length of
that value in bytes. For getsockopt(), optlen is a pointer to an integer that
holds the size of the buffer on entry and the length of the returned value on
exit.
The full man page with details of all the options is found in the man page
setsockopt(2).
Returning to the port-reuse problem: after opening the server socket and before
calling bind(), call the setsockopt() function with the following segment of
code:
#ifdef LINUX
int opt = 1;
setsockopt(sockfd,SOL_SOCKET,SO_REUSEADDR,&opt,sizeof(opt));
#endif
The #ifdef and #endif statements are necessary only if you want to port the code
over to systems other than Linux. Some UNIX systems might not support or require
the SO_REUSEADDR flag.
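To make the set/get pairing concrete, here is a small sketch that sets
SO_REUSEADDR and then reads it back. The helper names enable_reuseaddr and
reuseaddr_is_set are hypothetical. Note the asymmetry in the last argument:
setsockopt() takes the length by value, while getsockopt() takes a pointer to
it.

```c
#include <sys/types.h>
#include <sys/socket.h>   /* setsockopt(), getsockopt(), SO_REUSEADDR */

/* Hypothetical helper: turn on SO_REUSEADDR; returns 0 or -1 like setsockopt(). */
int enable_reuseaddr(int sockfd)
{
    int opt = 1;
    return setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
}

/* Hypothetical helper: returns 1 if the option is set, 0 if clear, -1 on error. */
int reuseaddr_is_set(int sockfd)
{
    int opt = 0;
    socklen_t len = sizeof(opt);         /* in: buffer size, out: value length */
    if (getsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &opt, &len) < 0)
        return -1;
    return opt != 0;
}
```

A server would call enable_reuseaddr() right after socket() and before bind().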
The connect() System Call
The connect() system call is used by clients to connect to a server in a
connection-oriented system. A client does not need to call bind() first; if it
has not, the kernel chooses a local address and port during connect(). The call
is defined like this:
#include<sys/types.h>
#include<sys/socket.h>
int connect(int sockfd, const struct sockaddr *servsaddr, socklen_t addrlen);
The parameters are the same as those for the bind call, with the exception that
the servsaddr points to information about the server that the client is
connecting to. On the server, the accept call creates a new socket for each
request. This way, the server can fork() off a new process to handle it and go
back to waiting for more connections. On the server side of things, you would
have code that looks like that shown in Listing 59.1.
Listing 59.1. The server side for a socket-oriented protocol.
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h> /* struct sockaddr_in, htonl(), htons() */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define MY_PORT 6545
/*
 * myabort() and do_your_thing() are not defined in the original listing;
 * these minimal stand-ins are assumptions that make it compile.
 */
static void myabort(const char *msg) { perror(msg); exit(1); }
static void do_your_thing(int fd) { (void)fd; /* talk to the client here */ }
int main(void)
{
    int sockfd, newfd;
    int cpid; /* child id */
    socklen_t len;
    struct sockaddr_in servaddr;
    struct sockaddr_in clientInfo;
    if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
    {
        myabort("Unable to create socket");
    }
#ifdef LINUX
    {
        int opt = 1;
        setsockopt(sockfd,SOL_SOCKET,SO_REUSEADDR,&opt,sizeof(opt));
    }
#endif
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(MY_PORT);
    /*
     * The htonl (for a long integer) and htons (for short integer) convert
     * a host oriented byte order into a network order.
     */
    if (bind(sockfd,(struct sockaddr *)&servaddr,sizeof(servaddr)) < 0)
    {
        myabort("Unable to bind socket");
    }
    if (listen(sockfd,5) < 0)
    {
        myabort("Unable to listen on socket");
    }
    signal(SIGCHLD, SIG_IGN); /* let the kernel reap finished children */
    for (;;)
    {
        /* wait here */
        len = sizeof(clientInfo);
        newfd=accept(sockfd,(struct sockaddr *)&clientInfo,&len);
        if (newfd < 0)
        {
            myabort("Unable to accept on socket");
        }
        if ((cpid = fork()) < 0)
        {
            myabort("Unable to fork on accept");
        }
        else if (cpid == 0) { /* child */
            close(sockfd); /* no need for original */
            do_your_thing(newfd);
            exit(0);
        }
        close(newfd); /* in the parent */
    }
}
In the case of connection-oriented protocols, the server performs the following
functions:
Creates a socket with a call to the socket() function
Binds itself to an address with the bind() function call
Listens for connections with the listen() function call
Accepts any incoming requests with the accept() function call
Gets incoming messages with the read() function and replies with the write()
call
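The five steps above can be sketched as two small functions. This is an
illustration under stated assumptions rather than the chapter's own code: the
helper names listen_on_loopback and echo_one_client are hypothetical, the
server binds to 127.0.0.1 only, and it asks the kernel for any free port by
binding to port 0 so the sketch does not collide with a running service.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* struct sockaddr_in, htonl(), htons(), ntohs() */
#include <string.h>       /* memset() */
#include <unistd.h>       /* read(), write(), close() */

/*
 * Hypothetical helper: socket() + bind() + listen() on 127.0.0.1.
 * The kernel-chosen port is stored in *port (host byte order).
 * Returns the listening descriptor, or -1 on error.
 */
int listen_on_loopback(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);            /* 0 = let the kernel pick a port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/*
 * Hypothetical helper: accept() one client, read() one message and
 * write() it straight back. Returns the number of bytes echoed, or -1.
 */
ssize_t echo_one_client(int listenfd)
{
    char buf[256];
    ssize_t n;
    int newfd = accept(listenfd, NULL, NULL);   /* peer address not needed */
    if (newfd < 0)
        return -1;
    n = read(newfd, buf, sizeof(buf));
    if (n > 0 && write(newfd, buf, (size_t)n) != n)
        n = -1;
    close(newfd);
    return n;
}
```

A real server would call echo_one_client() (or fork a child, as in Listing
59.1) inside an endless loop.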
Now let's look at the client side of things, in Listing 59.2.
Listing 59.2. The client side function.
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h> /* struct sockaddr_in, htons() */
#include <arpa/inet.h> /* inet_addr() */
#include <string.h>
#include <unistd.h>
#define MY_PORT 6545
#define MY_HOST_ADDR "204.25.13.1"
int getServerSocketId(void)
{
    int fd;
    struct sockaddr_in serv_addr;
    /* create an Internet domain stream (TCP) socket */
    if ( (fd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
    {
        return(-1);
    }
    /* fill socket address structure with the server's address */
    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    /* convert internet address to binary value */
    serv_addr.sin_addr.s_addr = inet_addr(MY_HOST_ADDR);
    serv_addr.sin_port = htons(MY_PORT);
    /*
     * No bind() here: the kernel picks the client's local address and
     * port during connect(), so the original -2 return is no longer used.
     */
    if (connect(fd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0)
    {
        close(fd);
        return(-3);
    }
    return(fd);
}
The client for connection-oriented communication also takes the following steps:
Creates a socket with a call to the socket() function
Attempts to connect to the server with a connect() call
If a connection is made, requests data with the write() call, and reads
incoming replies with the read() function
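The client steps can be sketched as one function that takes the server address
as parameters instead of compile-time constants. The name connect_to_server is
hypothetical, and the sketch assumes an IPv4 address in dotted-quad form.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* struct sockaddr_in, htons() */
#include <arpa/inet.h>    /* inet_pton() */
#include <string.h>       /* memset() */
#include <unistd.h>       /* close() */

/*
 * Hypothetical helper: socket() + connect() to ip:port.
 * Returns a connected descriptor ready for write()/read(), or -1.
 */
int connect_to_server(const char *ip, unsigned short port)
{
    struct sockaddr_in srv;
    int fd;
    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &srv.sin_addr) != 1)
        return -1;                       /* not a valid IPv4 address */
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&srv, sizeof(srv)) < 0) {
        close(fd);                       /* do not leak the descriptor */
        return -1;
    }
    return fd;
}
```

No bind() call appears: the kernel assigns the client's own address and port
when connect() runs.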
Connectionless Socket Programming
Now let's consider the case of a connectionless exchange of information. The
principle on the server side is different from the connection-oriented server
side in that the server calls recvfrom() rather than the listen and accept
calls. Also, to reply to messages, the server uses the sendto() function call.
See Listing 59.3 for the server side.
Listing 59.3. The server side.
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h> /* struct sockaddr_in, htonl(), htons() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MY_PORT 6545
#define MAXM 4096
char mesg[MAXM];
/*
 * myabort() and doSomethingToIt() are not defined in the original listing;
 * these minimal stand-ins are assumptions that make it compile.
 */
static void myabort(const char *msg) { perror(msg); exit(1); }
static void doSomethingToIt(char *m) { (void)m; /* process the message here */ }
int main(void)
{
    int sockfd;
    ssize_t n;
    socklen_t len;
    struct sockaddr_in servaddr;
    struct sockaddr_in clientInfo;
    /* SOCK_DGRAM selects a connectionless (UDP) socket */
    if ((sockfd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
    {
        myabort("Unable to create socket");
    }
#ifdef LINUX
    {
        int opt = 1;
        setsockopt(sockfd,SOL_SOCKET,SO_REUSEADDR,&opt,sizeof(opt));
    }
#endif
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(MY_PORT);
    /*
     * The htonl (for a long integer) and htons (for short integer) convert
     * a host oriented byte order into a network order.
     */
    if (bind(sockfd,(struct sockaddr *)&servaddr,sizeof(servaddr)) < 0)
    {
        myabort("Unable to bind socket");
    }
    for (;;)
    {
        /* wait here; len must hold the buffer size before every call */
        len = sizeof(clientInfo);
        n = recvfrom(sockfd, mesg, MAXM, 0,
                     (struct sockaddr *)&clientInfo, &len);
        if (n < 0)
        {
            continue; /* skip failed receives */
        }
        doSomethingToIt(mesg);
        sendto(sockfd, mesg, (size_t)n, 0,
               (struct sockaddr *)&clientInfo, len);
    }
}
As you can see, the two function calls to process each message make this an
easier implementation than a connection-oriented one. You must, however, process
each message one at a time because messages from multiple clients can be
multiplexed together. In a connection-oriented scheme, the child process always
knows where each message originated.
The client does not have to call the connect() system call either. Instead, the
client can call the sendto() function directly. The client side is identical to
the server side, with the exception that the sendto call is made before the
recvfrom()call:
#include <sys/types.h>
#include <sys/socket.h>
ssize_t sendto(int sockfd,
const void *message, /* the pointer to message */
size_t length, /* of message */
int flags, /* usually 0 */
const struct sockaddr *dest, /* where to send it */
socklen_t addrlen); /* of sockaddr */
NOTE: If you are a BSD user, use the sendto() call, not the sendmsg() call.
The sendto() call is more efficient.
Any errors are indicated by a return value of -1. Only local errors are
detected.
The recvfrom() system call is defined as shown here:
#include <sys/types.h>
#include <sys/socket.h>
ssize_t recvfrom(int sockfd,
void *message, /* buffer that receives the message */
size_t length, /* of buffer */
int flags, /* usually 0 */
struct sockaddr *from, /* filled in with the sender's address */
socklen_t *addrlen); /* in: size of from; out: length of address */
If a message is too long to fit in the supplied buffer, the extra bytes are
discarded. The call might return immediately or wait forever, depending on the
type of the flag being set. You can even set timeout values. Check the man pages
for recvfrom for more information.
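To show sendto() and recvfrom() working as a pair, here is a sketch of a
one-shot datagram echo. The helper names udp_socket_on_loopback and
udp_echo_once are hypothetical, and the sockets are bound to 127.0.0.1 on
kernel-chosen ports so the sketch does not depend on any particular port being
free.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* struct sockaddr_in, htonl() */
#include <string.h>       /* memset() */
#include <unistd.h>       /* close() */

/*
 * Hypothetical helper: UDP socket bound to 127.0.0.1 on a kernel-chosen
 * port. The resulting address is stored in *self. Returns fd or -1.
 */
int udp_socket_on_loopback(struct sockaddr_in *self)
{
    socklen_t len = sizeof(*self);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    memset(self, 0, sizeof(*self));
    self->sin_family = AF_INET;
    self->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)self, sizeof(*self)) < 0 ||
        getsockname(fd, (struct sockaddr *)self, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: receive one datagram and send it back to its
 * sender. Returns the number of bytes echoed, or -1 on error.
 */
ssize_t udp_echo_once(int fd)
{
    char mesg[512];
    struct sockaddr_in from;
    socklen_t fromlen = sizeof(from);    /* must be set before recvfrom() */
    ssize_t n = recvfrom(fd, mesg, sizeof(mesg), 0,
                         (struct sockaddr *)&from, &fromlen);
    if (n < 0)
        return -1;
    return sendto(fd, mesg, (size_t)n, 0,
                  (struct sockaddr *)&from, fromlen);
}
```

Because recvfrom() fills in the sender's address on every call, the server
needs no per-client state to know where to send the reply.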
There you have it--the very basics of how to program applications to take
advantage of the networking capabilities under Linux. We have not even scratched
the surface of all the intricacies of programming for networks. A good starting
point for more detailed information would be UNIX Network Programming, by W.
Richard Stevens (Prentice Hall, 1990). This book, a classic, is used in most
universities and is by far the most detailed book to date.
Record and File Locking
When two processes want to share a file, the danger exists that one process
might affect the contents of the file, and thereby affect the other process. For
this reason, most operating systems use a mutually exclusive principle: when one
process has a file open, no other process can touch it. This is called file
locking.
This technique is simple to implement. What usually happens is that a "lock
file" is created with the same name as the original file but with the extension
.lock, which tells other processes that the file is unavailable. This is how
many Linux spoolers, such as the print system and UUCP, implement file locking.
It is a brute-force method, perhaps, but effective and easy to program.
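One common way to build such a lock file, sketched here as an assumption rather
than as the spoolers' actual code, is to create it with open() and the O_CREAT
and O_EXCL flags: the call fails if the file already exists, so only one
process can win. The helper names acquire_lock_file and release_lock_file are
hypothetical.

```c
#include <fcntl.h>        /* open(), O_CREAT, O_EXCL */
#include <stdio.h>        /* snprintf() */
#include <unistd.h>       /* close(), unlink() */

/* Build "<path>.lock" in out; returns 0, or -1 if it does not fit. */
static int lock_name(char *out, size_t outlen, const char *path)
{
    int n = snprintf(out, outlen, "%s.lock", path);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * Hypothetical helper: returns 0 if this process now holds the lock,
 * -1 if the lock file already exists or cannot be created.
 */
int acquire_lock_file(const char *path)
{
    char name[4096];
    int fd;
    if (lock_name(name, sizeof(name), path) < 0)
        return -1;
    /* O_EXCL makes creation fail when the lock file is already there */
    fd = open(name, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/* Hypothetical helper: remove the lock file; returns 0 or -1. */
int release_lock_file(const char *path)
{
    char name[4096];
    if (lock_name(name, sizeof(name), path) < 0)
        return -1;
    return unlink(name);
}
```

If the holder crashes without calling release_lock_file(), the stale .lock file
stays behind, which is the hang described in the next paragraph.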
Unfortunately, this technique is not good when you must have several processes
access the same information quickly, because the delays waiting for file opening
and closing can grow to be appreciable. Also, if one process doesn't release the
file properly, other processes can hang there, waiting for access.
For this reason, record locking is sometimes implemented. With record locking, a
single part of a larger file is locked to prevent two processes from changing
its contents at the same time. Record locking enables many processes to access
the same file at the same time, each updating different records within the file,
if necessary. The programming necessary to implement record locking is more
complex than that for file locking, of course.
Normally, to implement record locking, you use a file offset, or the number of
characters from the beginning of the file. In most cases, a range of characters
is locked; the program has to note the start of the locking region and the
length of it, and then store that information where other processes can examine
it.
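The chapter does not name a mechanism, but on POSIX systems the usual one is
the fcntl() advisory lock, where the kernel itself records the start and length
of each locked region. The sketch below locks a byte range and uses a forked
child to probe whether another process could lock it. The helper names
set_record_lock and child_can_lock are hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>     /* waitpid(), WIFEXITED(), WEXITSTATUS() */
#include <fcntl.h>        /* fcntl(), struct flock, F_SETLK */
#include <string.h>       /* memset() */
#include <unistd.h>       /* fork(), _exit(), SEEK_SET */

/*
 * Hypothetical helper: apply type (F_RDLCK, F_WRLCK or F_UNLCK) to len
 * bytes starting at offset. Does not block; returns 0 or -1.
 */
int set_record_lock(int fd, short type, off_t offset, off_t len)
{
    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;              /* offset counts from file start */
    fl.l_start = offset;
    fl.l_len = len;
    return fcntl(fd, F_SETLK, &fl);
}

/*
 * Hypothetical helper: fork a child that tries to write-lock the range.
 * Returns 1 if the child got the lock, 0 if it was refused, -1 on error.
 * The child's lock disappears when the child exits.
 */
int child_can_lock(int fd, off_t offset, off_t len)
{
    int status;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(set_record_lock(fd, F_WRLCK, offset, len) == 0 ? 0 : 1);
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

These locks are advisory: they only stop processes that also ask for a lock,
and a process that simply calls write() is not prevented from changing the
region.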
Writing either file-locking or record-locking code requires a good understanding
of the operating system but is otherwise not difficult, especially because
thousands of programs are readily available from the Internet, in networking
programming books, and on BBSs to examine for sample code.
Interprocess Communications
Network programming always involves two or more processes talking to each other
(interprocess communications), so the way in which processes communicate is
vitally important to network programmers. Network programming differs from the
usual method of programming in a few important aspects. A traditional program
can talk to different modules (or even other applications on the same machine)
through global variables and function calls. That doesn't work across networks.
A key goal of network programming is to ensure that processes don't interfere
with each other. Otherwise, systems can get bogged down or can lock up.
Therefore, processes must have a clean and efficient method of communicating.
UNIX is particularly strong in this regard, because many of the basic UNIX
capabilities, such as pipes and queues, are used effectively across networks.
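As a small, local illustration of one of the facilities mentioned above, the
following sketch passes a message from a child process to its parent through a
pipe. The helper name receive_from_child is hypothetical, and the example is
limited to two processes on the same machine.

```c
#include <sys/types.h>
#include <sys/wait.h>     /* waitpid() */
#include <string.h>       /* strlen() */
#include <unistd.h>       /* pipe(), fork(), read(), write(), close() */

/*
 * Hypothetical helper: fork a child that writes msg into a pipe, then
 * read it in the parent. Returns the number of bytes read, or -1.
 */
ssize_t receive_from_child(const char *msg, char *buf, size_t buflen)
{
    int fds[2];                          /* fds[0] = read end, fds[1] = write end */
    ssize_t n;
    pid_t pid;
    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                      /* child: writer */
        close(fds[0]);
        _exit(write(fds[1], msg, strlen(msg)) < 0 ? 1 : 0);
    }
    close(fds[1]);                       /* parent: reader */
    n = read(fds[0], buf, buflen);
    close(fds[0]);
    waitpid(pid, NULL, 0);               /* reap the child */
    return n;
}
```

Each process closes the end of the pipe it does not use, so the reader sees
end-of-file once the writer has exited.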
Writing code for interprocess communications is quite difficult compared to
single application coding. If you want to write this type of routine, you should
study sample programs from a network programming book or a BBS site to see how
this task is accomplished.
Summary
Few people need to write network applications, so the details of the process are
best left to those who want them. Experience and lots of examples are the best
way to begin writing network code, and mastering the skills can take many years.