Introductory Perl Web Server Programming

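If you are interested in web server programming and in Perl CGI programming on Linux, the material below is for you. You do not need CGI development experience; knowing the Perl language is enough to follow along. This is my first blog post, so please point out any mistakes. Thanks.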

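HTTP messages come in two kinds: HTTP requests sent by the client to the server, and HTTP responses sent back by the server to the client.

Both consist of a mandatory header (made up of a few required header lines and many optional ones) plus an optional body (the document being transferred). The request methods below fetch or send headers and bodies.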

GET: fetches the header and body from the server

GET /index.html HTTP/1.1

Host: www.myserver.com

HEAD: fetches only the header of the resource

HEAD /index.html HTTP/1.1

Host: www.myserver.com

PUT: sends information to the server in the HTTP body

PUT /doc/newdoc.html HTTP/1.1

Host: www.myserver.com

Content-Length: 2043

<html>

...........................

POST: sends information to the server in the HTTP body

POST /index.html HTTP/1.1

Host: www.myserver.com

Content-Length: 22

Roses=red&violets=blue

TRACE: traces an HTTP request/response exchange; it is not a method used for normal requests

TRACE * HTTP/1.1

Host: www.myserver.com

DELETE: removes a resource from the server; for security reasons most servers do not support this method

DELETE /doc/olddoc.html HTTP/1.1

Host: www.myserver.com

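POST is used to send content that a server-side application such as a CGI script will process in order to create a resource (for example a process on the server; spawning a process for every request is a major reason CGI is criticized). The URL names the application the data is sent to. For instance, if the form data is too long to fit in a URL (256 characters is a commonly cited limit for older software; the real limit depends on the client and server), GET cannot be used and POST is the only option.

PUT indicates that a new resource described by the URL is being created. The URL identifies the new resource, which can then be retrieved with a subsequent GET request.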
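The request lines shown above all follow the same shape: method, request target, protocol version. Here is a minimal sketch of how one could be taken apart in Perl. The function name parse_request_line is my own invention for illustration and is not part of any module.

```perl
use strict;
use warnings;

# Split an HTTP request line into method, target and protocol version.
# Returns an empty list when the line is malformed.
sub parse_request_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop the line terminator, CRLF or bare LF
    my ($method, $target, $version) =
        $line =~ m{^([A-Z]+) (\S+) HTTP/(\d+\.\d+)\z}
        or return;
    return ($method, $target, $version);
}

my ($method, $target, $version) =
    parse_request_line("GET /index.html HTTP/1.1\r\n");
print "$method $target (HTTP $version)\n";
```

A line without a space before the protocol version does not match the pattern, so the function returns an empty list for it.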

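HTTP response: the status line consists of the HTTP protocol version + a three-digit code describing the type of response + a text description of the response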

Eg: HTTP/1.1 200 OK

Eg: HTTP/1.1 404 NOT FOUND

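HTTP headers: any number may be sent; the client and server use headers to communicate with each other. (By convention the first letter of each header name is capitalized and a space follows the colon.)

GET /~unixdb/test.html HTTP/1.1 #request line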

Connection: Keep-Alive

User-Agent: Mozilla/4.75 [en] (X11;u;Linux 2.2.17 i686)

Host: www.myserver.com

Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, text/html, */*

Accept-Encoding: gzip

Accept-Language: en

Accept-Charset: iso-8859-1,*,utf-8

A typical server response to such a GET request:

HTTP/1.1 200 OK #HTTP status line

Date: Tue, 7 Sep 2004 23:35 GMT #HTTP header

Server: Apache/1.3.12 (Unix) mod_ssl/2.6.4 OpenSSL/0.9.5a

Connection: close

Content-Type: text/html

<Title>testing</Title>

testing
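To make the layout of a response concrete, here is a small sketch that assembles a status line, a few headers, the blank separator line, and a body into one string. The helper build_response is a name I made up; a real server would also send Date and Server headers as in the sample above.

```perl
use strict;
use warnings;

# Assemble a complete HTTP response as one string.
sub build_response {
    my ($code, $reason, $type, $body) = @_;
    my $crlf = "\015\012";                     # CR LF, the HTTP line terminator
    return join $crlf,
        "HTTP/1.1 $code $reason",              # status line
        "Content-Type: $type",
        "Content-Length: " . length($body),    # assumes $body holds bytes
        "Connection: close",
        "",                                    # empty line ends the headers
        $body;
}

print build_response(200, 'OK', 'text/html', '<Title>testing</Title>'), "\n";
```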

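HTTP body: the PUT and POST methods require a body; for them it is not optional. In Perl, the body is separated from the headers by \n\n (the official separator sequence is \015\012\015\012, that is, CRLF CRLF).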

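Eg:

my $request = <>;    # the request line, e.g. "GET /index.html HTTP/1.1"
my (@headers, $body);
my $in_body = 0;
while (<>)
{
    if ($in_body) { $body .= $_; next; }    # $body .= $_ means $body = $body . $_
    if (/^\r?$/) { $in_body = 1; next; }    # the first blank line ends the headers
    push @headers, $_;
}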
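When the whole message is already in a string rather than arriving line by line, the same split can be done in one step on the blank line. This is only a sketch: split_message is a hypothetical helper, and it keeps just the last value when a header name repeats.

```perl
use strict;
use warnings;

# Split a raw HTTP message into (start line, header hash ref, body).
sub split_message {
    my ($raw) = @_;
    # Everything before the first blank line is the head; the rest is the body.
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    $body = '' unless defined $body;
    my ($start, @lines) = split /\r?\n/, $head;
    my %headers;
    for my $line (@lines) {
        my ($name, $value) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
        $headers{lc $name} = $value;    # header names are case-insensitive
    }
    return ($start, \%headers, $body);
}

my $raw = "POST /index.html HTTP/1.1\015\012"
        . "Host: www.myserver.com\015\012"
        . "Content-Length: 22\015\012"
        . "\015\012"
        . "Roses=red&violets=blue";
my ($start, $headers, $body) = split_message($raw);
print "$start\n$headers->{host}\n$body\n";
```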

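By contrast, a CGI response is an HTTP response without the status line. The one header it is required to send is Content-Type, followed by a media type that defines the content of the body.

Eg1: print "Content-Type: text/plain\n\n";

Eg2: print "Content-Type: image/jpeg\n\n";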

CGI environment variables can be read from %ENV:

my $docroot = $ENV{'DOCUMENT_ROOT'};

For example, we can list all of our environment variables like this:

#!/usr/bin/perl -w
#author:nick
#goal:check the ENV
#env.cgi
use strict;

# the HTTP header comes first, then a blank line, then the HTML body
print "Content-Type: text/html\n\n";
print "<html><head><title>Environment Variables</title></head>\n";
print "<body><p>Here are your environment variables</p>\n";
print "<blockquote><table border=1>\n";
foreach (sort keys %ENV)
{
    print "\t<tr><td>$_</td><td>$ENV{$_}</td></tr>\n";
}
print "</table></blockquote></body></html>\n";

#run the program
#you can see:
#CONTENT_LENGTH: length of the HTTP request body; for a GET it is empty or 0,
#since we did not PUT or POST a body here
#GATEWAY_INTERFACE: CGI protocol name and version
#HTTP_REFERER: URI of the page that linked to this resource
#HTTP_USER_AGENT: the client's software, maybe IE or Netscape
#PATH_INFO: additional path information appended after the script name
#QUERY_STRING: query string of the requested URI; may be empty, but is
#important for CGI programs
#REMOTE_ADDR: client's IP address
#REMOTE_HOST: client's host name
#REQUEST_METHOD: HTTP method of the request, for example "GET" or "POST"
#SCRIPT_FILENAME: file path of the script
#SCRIPT_NAME: URL path of the script
#SERVER_NAME: server's host name
#SERVER_SOFTWARE: name and version of the web server handling the request
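Of these variables, QUERY_STRING is the one a CGI program usually has to decode itself. Below is a rough sketch of what that decoding involves, using only core Perl. The names url_decode and parse_query are mine, and unlike the CGI module this version keeps only the last value when a parameter name repeats.

```perl
use strict;
use warnings;

# Decode one URL-encoded component: '+' means space, %XX is a byte in hex.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Turn "a=1&b=two+words" into a hash of decoded names and values.
sub parse_query {
    my ($query) = @_;
    $query = '' unless defined $query;    # QUERY_STRING may be unset
    my %param;
    for my $pair (split /[&;]/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        $param{ url_decode($name) } = url_decode($value);
    }
    return %param;
}

my %param = parse_query($ENV{QUERY_STRING});
print "$_ = $param{$_}\n" for sort keys %param;
```

Run from the command line with QUERY_STRING unset, the script simply prints nothing.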

Creating the HTTP header and the HTML header in Perl:

Object-oriented: print $cgi->header; (or print $cgi->header('image/jpeg');) print $cgi->start_html("my cgipage");

Procedural: print header; (or print header('image/jpeg');) print start_html("my cgipage");

Advanced headers:

print $cgi->header(-status=>'200 Nistal',

-type=>'text/html',

-expires=>'+30s',

-nph=>1,

);

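This produces a header along these lines (with -expires=>'+30s', Expires is 30 seconds after Date):

HTTP/1.0 200 Nistal

Status: 200 Nistal

Expires: Sun, 25 Dec 2005 04:15:30 GMT

Date: Sun, 25 Dec 2005 04:15:00 GMT

Content-Type: text/html; charset=ISO-8859-1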

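Parameter list:

-status: the response code and message

-expires: the expiry time or date

-nph: switches to non-parsed-header mode, in which the script's output is sent straight to the client as it is produced, without the server intervening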

To make the output unbuffered, we can set $|=1 to turn on autoflush mode. NPH mode can also be switched on by passing -nph as an import pragma:

use CGI qw(:standard -nph);
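A relative value such as -expires=>'+30s' has to end up as an absolute HTTP date in the header. The following is a sketch of that conversion with core Perl only; http_date is a name I chose, and this is not how the CGI module implements it internally.

```perl
use strict;
use warnings;

my @DAY = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MON = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Format an epoch time as an HTTP date, always in GMT.
sub http_date {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf '%s, %02d %s %04d %02d:%02d:%02d GMT',
        $DAY[$wday], $mday, $MON[$mon], $year + 1900, $hour, $min, $sec;
}

my $now = time;
print "Date: ",    http_date($now),      "\n";
print "Expires: ", http_date($now + 30), "\n";    # the '+30s' case
```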

Advanced document header:

#!/usr/bin/perl
#headertag.cgi
use warnings;
use strict;

#import invented 'link' and 'myheadertag' elements
use CGI qw(:standard link myheadertag);

print header;
print start_html(
    -title  => 'Big Document Header',
    -author => 'cloudfordnick@yahoo.com.cn',
    -xbase  => 'http://www.myserver.com',    # base URL of the document
    -target => 'my_panel',                   # target frame of the document
    -meta   => {                             # hash reference of name/value pairs for meta tags
        description => 'How to define a CGI header with Metatags',
        keywords    => 'meta,metadata,cgi,tags,html,perl',
    },
    -style  => {
        src => '/css/mystylesheet.css',
    },
);

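The CGI module handles most URL escaping and unescaping for us. Escape only the query name and value, not the whole URL; otherwise the "://" and "/" characters would be encoded too:

print '<a href="', $url, '?', $cgi->escape($key), '=', $cgi->escape($value), '">Link</a>';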
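To show what escaping does to a query name or value, here is a hand-rolled sketch using core Perl. The helpers percent_encode and build_link are hypothetical names of mine, and the code assumes byte strings (text should be encoded as UTF-8 first).

```perl
use strict;
use warnings;

# Percent-encode every byte except the unreserved URL characters.
sub percent_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Append encoded name=value pairs to a base URL that is left untouched.
sub build_link {
    my ($base, %args) = @_;
    my @pairs;
    for my $name (sort keys %args) {
        push @pairs, percent_encode($name) . '=' . percent_encode($args{$name});
    }
    return @pairs ? $base . '?' . join('&', @pairs) : $base;
}

print build_link('http://www.myserver.com/cgi-bin/env.cgi',
                 note => 'roses & violets'), "\n";
```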

The url method returns just the URL of the script:

$cgi->url(-full=>1) # full URL: http://myserver/path/script/nistal

$cgi->url(-absolute=>1) # absolute URL path, without host: /path/script/nistal

$cgi->url(-relative=>1) # relative URL, the script name only: nistal

Saving and loading CGI state

Save: if (open(my $fh, '>', $state))
{
    $cgi->save($fh);
    close $fh;
}

Load: if (open(my $fh, '<', $state))
{
    $cgi = CGI->new($fh);    # rebuild the CGI object from the saved file
    close $fh;
}

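Here are some simple example scripts:

1. A simple web server that serves pages out of the document root given by $docroot. It handles only GET requests and, when the requested URL matches a document, sends that document back to the client.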

#!/usr/bin/perl
#httpd.pl
use strict;
use warnings;
use HTTP::Daemon;
use HTTP::Status; #for RC_FORBIDDEN

my $docroot = "/home/httpd/html";
my $httpd = HTTP::Daemon->new or die "Cannot start server: $!";
print "Server running at: ", $httpd->url, "\n";

while (my $connection = $httpd->accept)
{
    while (my $request = $connection->get_request)
    {
        if ($request->method eq 'GET')
        {
            my $file = $request->uri->path;    # always starts with "/"
            $connection->send_file_response("$docroot$file");
        }
        else
        {
            $connection->send_error(RC_FORBIDDEN);
        }
    }
    # close only after the client has sent all of its requests
    $connection->close;
    undef($connection);
}

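The HTTP::Daemon object inherits from IO::Socket::INET (IO::Socket::IP in recent releases), so we can perform socket operations on it. The daemon object looks up a usable name for the local host and picks a port number to serve on. Eg, on a Unix server: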

Server running at: http://localhost.localdomain:1640/

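The script creates the daemon object, which listens for incoming calls and waits for connections. When a client connects (for example a browser pointed at the server), accept returns a connection object. We then take the URI of the GET request and translate it into a path, and finally call send_file_response on the connection object to send the requested document back to the client. If the file is not found, a 404 Not Found response is sent; if it is a directory, a 501 Not Implemented error is sent.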
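Translating the URI into a path deserves some care, because a request for something like /../etc/passwd must not escape the document root. Below is a sketch of one way to do the mapping. The helper resolve_path is my own name and is not part of HTTP::Daemon, and it assumes the path has already been percent-decoded.

```perl
use strict;
use warnings;

# Map a request path such as "/docs/a.html" to a file under $docroot.
# Returns undef (in scalar context) for paths that try to climb out of it.
sub resolve_path {
    my ($docroot, $uri_path) = @_;
    my @clean;
    for my $part (split m{/+}, $uri_path) {
        next if $part eq '' || $part eq '.';    # skip empty and "." segments
        return if $part eq '..';                # refuse any parent-directory step
        push @clean, $part;
    }
    return join '/', $docroot, @clean;
}

my $docroot = '/home/httpd/html';
for my $path ('/index.html', '/docs//a.html', '/../etc/passwd') {
    my $file = resolve_path($docroot, $path);
    print "$path => ", (defined $file ? $file : 'refused'), "\n";
}
```

Refusing every ".." segment is stricter than necessary, but it is simple to reason about.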

Methods of HTTP::Daemon:

1.new creates a new server

Eg: $httpd = HTTP::Daemon->new(
    LocalAddr => 'www.myserver.com',
    LocalPort => 80,
);

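2.$httpd->accept accepts a connection request from a client and returns an HTTP::Daemon::ClientConn object

3.$httpd->url returns the URL, with host name and port, at which the daemon can be reached: http://server:port/

4.$conn->get_request reads an HTTP request from the client and returns an HTTP::Request object. It handles chunked transfers and document uploads encoded as multipart/form-data. Passing a true argument makes get_request return as soon as the headers have been read; read_buffer can then be used to read the body in pieces.

5.$conn->read_buffer returns the bytes that get_request has read but not yet consumed

6.$conn->reason returns the reason why get_request failed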

7.$conn->send_status_line sends an HTTP response status line to the client

Eg1: $conn->send_status_line(RC_NOT_FOUND);

#generate standard '404' response

Eg2: $conn->send_status_line(404, "It wasn't there!");

#same 404 status code as Eg1, but with a custom message

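8.$conn->send_file_response tries to open and read a file and send its contents to the client

9.$conn->send_file tries to send the contents of the file handle passed to it to the client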

2. A simple server-push counter

Server push: the server keeps updating the client with new information

#!/usr/bin/perl -w
#nph-push.cgi
use strict;
use CGI::Push qw(:standard);

my $cgi = CGI::Push->new;
$cgi->do_push(-next_page => \&refresh);    # pass a reference to the subroutine

sub refresh {
    my ($cgi, $count) = @_; #passed in by CGI::Push
    my $page = start_html("CGI PUSH DEMO")
             . p("The count is $count")
             . end_html;
    return $page;
}

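A CGI::Push application uses the do_push method to register a subroutine, which is called on every cycle to produce the page content. CGI::Push keeps track of the count and passes it to the subroutine.

The subroutine registered with do_push can choose to end the updates and terminate the HTTP response by returning an undefined value as its result: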

#!/usr/bin/perl
#nph-pushlast.cgi
use warnings;
use strict;
use CGI::Push;

my $cgi = CGI::Push->new;
$cgi->do_push(-next_page => \&refresh, -last_page => \&done);

sub refresh {
    my ($cgi, $count) = @_; #passed in by CGI::Push
    return undef if $count == 10;    # undef ends the updates
    return $cgi->start_html, $cgi->p("The count is $count"), $cgi->end_html;
}

sub done {
    my ($cgi, $count) = @_;
    return $cgi->start_html, "Count stopped on $count", $cgi->end_html;
}

Internally the module passes this list to print, so anything that print accepts is a valid return value for the -next_page or -last_page subroutine.
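Under the hood, server push relies on the multipart/x-mixed-replace content type, where each new part replaces the previous one in the browser. The sketch below imitates that framing with core Perl so the structure is visible. The function push_stream and the boundary string are my own inventions, and a real push script would print and flush each part as it is produced, with a delay in between, rather than building one string.

```perl
use strict;
use warnings;

# Build a multipart/x-mixed-replace stream from a page callback.
# $next_page is called with a counter starting at 1 until it returns undef.
sub push_stream {
    my ($next_page, $boundary) = @_;
    my $crlf  = "\015\012";
    my $out   = qq{Content-Type: multipart/x-mixed-replace;boundary="$boundary"}
              . $crlf . $crlf;
    my $count = 0;
    while (defined(my $page = $next_page->(++$count))) {
        $out .= "--$boundary$crlf"                     # start of a new part
              . "Content-Type: text/html$crlf$crlf"    # each part has its own header
              . $page . $crlf;
    }
    return $out . "--$boundary--$crlf";                # closing boundary
}

print push_stream(
    sub { my ($count) = @_; $count <= 3 ? "<p>The count is $count</p>" : undef },
    'push-demo',
);
```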