How to Insert RDF Content into a Web Site with PHP (4)

Back To Class

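With this much power at your disposal, why limit yourself to a single RDF source? As I said earlier, most major sites regularly publish snapshots of their content. Plugging all of these different sources into your own site is actually quite simple. Let's see how it's done.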

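First, let's modularize the code from the previous example, so that you don't have to rewrite the same code over and over for every source. The simplest way is to package it as a class and then include that class in the PHP script.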

Here is the class:

<?php
class RDFParser
{
    // variables
    // set up local variables for this class
    var $currentTag = "";
    var $flag = "";
    var $count = 0;

    // name of the RDF file to parse, and the XML parser handle
    // (both are set by the methods below; declared here so they are not created on the fly)
    var $file = "";
    var $xp = null;

    // this is an associative array of channel data with keys ("title", "link", "description")
    var $channel = array();

    // this is an array of arrays, with each array element representing an <item>
    // each outer array element is itself an associative array
    // with keys ("title", "link", "description")
    var $items = array();

    // methods

    // set the name of the RDF file to parse
    // this is usually a local file
    // you may set it to a remote file if your PHP build supports URL fopen()
    function setResource($file)
    {
        $this->file = $file;
    }

    // parse the RDF file set with setResource()
    // this populates the $channel and $items arrays
    function parseResource()
    {
        // create parser
        $this->xp = xml_parser_create();

        // set object reference
        xml_set_object($this->xp, $this);

        // set handlers and parser options
        xml_set_element_handler($this->xp, "elementBegin", "elementEnd");
        xml_set_character_data_handler($this->xp, "characterData");
        xml_parser_set_option($this->xp, XML_OPTION_CASE_FOLDING, TRUE);
        xml_parser_set_option($this->xp, XML_OPTION_SKIP_WHITE, TRUE);

        // read XML file
        if (!($fp = fopen($this->file, "r")))
        {
            die("Could not read $this->file");
        }

        // parse data
        while ($xml = fread($fp, 4096))
        {
            if (!xml_parse($this->xp, $xml, feof($fp)))
            {
                die("XML parser error: " . xml_error_string(xml_get_error_code($this->xp)));
            }
        }

        // close the file and destroy the parser
        fclose($fp);
        xml_parser_free($this->xp);
    }

    // opening tag handler
    function elementBegin($parser, $name, $attributes)
    {
        $this->currentTag = $name;

        // set flag if entering <channel> or <item> block
        if ($name == "ITEM")
        {
            $this->flag = 1;
        }
        else if ($name == "CHANNEL")
        {
            $this->flag = 2;
        }
    }

    // closing tag handler
    function elementEnd($parser, $name)
    {
        $this->currentTag = "";

        // reset flag if exiting <channel> or <item> block
        if ($name == "ITEM")
        {
            $this->count++;
            $this->flag = 0;
        }
        else if ($name == "CHANNEL")
        {
            $this->flag = 0;
        }
    }

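    // character data handler
    function characterData($parser, $data)
    {
        $data = trim(htmlspecialchars($data));

        if ($this->currentTag == "TITLE" || $this->currentTag == "LINK" || $this->currentTag == "DESCRIPTION")
        {
            $key = strtolower($this->currentTag);

            // add data to the $channel[] or $items[] array
            // (the entry is created first so that .= never appends to an undefined key)
            if ($this->flag == 1)
            {
                if (!isset($this->items[$this->count][$key]))
                {
                    $this->items[$this->count][$key] = "";
                }
                $this->items[$this->count][$key] .= $data;
            }
            else if ($this->flag == 2)
            {
                if (!isset($this->channel[$key]))
                {
                    $this->channel[$key] = "";
                }
                $this->channel[$key] .= $data;
            }
        }
    }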

    // return an associative array containing channel information
    // (the $channel[] array)
    function getChannelInfo()
    {
        return $this->channel;
    }

    // return an associative array of arrays containing item information
    // (the $items[] array)
    function getItems()
    {
        return $this->items;
    }
}

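If you are familiar with PHP classes, this code should be easy to follow. If not, skip ahead to the links at the end of this article for a good piece on how classes work, then come back and read the code above.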

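Before using this class, I want to take a few minutes to point out one line in particular: the call to xml_set_object() above. That call associates the parser with the current object ($this), so that the handler names passed to xml_set_element_handler() and xml_set_character_data_handler() are resolved as methods of the class rather than as global functions.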
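To make the role of that call more concrete, here is a minimal sketch of the same idea written a different way: instead of xml_set_object() plus method names, each handler is passed as an array($this, "method") callable, which the XML handler functions also accept. As far as I know, newer PHP releases discourage xml_set_object(), so this form may be the safer one on a current install. The TitleCollector class and its method names are hypothetical and exist only for this illustration; they are not part of RDFParser.

```php
<?php
// Hypothetical illustration class: collects the text of every <title> element.
class TitleCollector
{
    public $titles = array();
    private $inTitle = false;

    // Parse a complete XML string; returns true if the parser reported success.
    public function parse($xml)
    {
        $xp = xml_parser_create();

        // array($this, "method") tells the parser to call methods on this object.
        // This replaces the xml_set_object($xp, $this) call used in RDFParser.
        xml_set_element_handler($xp, array($this, "begin"), array($this, "end"));
        xml_set_character_data_handler($xp, array($this, "text"));

        // third argument true: this is the final (and only) chunk of data
        return xml_parse($xp, $xml, true) === 1;
    }

    public function begin($parser, $name, $attributes)
    {
        // case folding is on by default, so element names arrive in upper case
        if ($name == "TITLE") {
            $this->inTitle = true;
            $this->titles[] = "";
        }
    }

    public function end($parser, $name)
    {
        if ($name == "TITLE") {
            $this->inTitle = false;
        }
    }

    public function text($parser, $data)
    {
        // character data may arrive in several pieces, so append to the last entry
        if ($this->inTitle) {
            $this->titles[count($this->titles) - 1] .= $data;
        }
    }
}
```

Either way, the point is the same: the parser needs to know which object owns the handler methods, otherwise it would look for global functions with those names.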

The question now is how to use this class to actually generate a Web page with multiple content sources.

<?php
include("class.RDFParser.php");

// how many items to display in each channel
$maxItems = 5;
?>
<html>
<head>
</head>
<body>
<table>
<tr>
<td>
<?php
// get and parse freshmeat.net channel
$f = new RDFParser();
$f->setResource("http://www.freshmeat.net/backend/fm-releases.rdf");
$f->parseResource();
$f_channel = $f->getChannelInfo();
$f_items = $f->getItems();

// now format and print it...
?>
The latest from <a href="<?php echo $f_channel["link"]; ?>"><?php echo $f_channel["title"]; ?></a>
<br>
<ul>
<?php
// iterate through items array
for ($x=0; $x<$maxItems; $x++)
{
    if (isset($f_items[$x]) && is_array($f_items[$x]))
    {
        // print data
        $item = $f_items[$x];
        echo "<li><a href=\"" . $item["link"] . "\">" . $item["title"] . "</a></li>";
    }
}
?>
</ul>
</td>
<td>
<i>Primary page content here</i>
</td>
<td>
<?php
// get and parse slashdot.org channel
$s = new RDFParser();
$s->setResource("http://slashdot.org/slashdot.rdf");
$s->parseResource();
$s_channel = $s->getChannelInfo();
$s_items = $s->getItems();

// now format and print it...
?>
The latest from <a href="<?php echo $s_channel["link"]; ?>"><?php echo $s_channel["title"]; ?></a>
<br>
<ul>
<?php
// iterate through items array
for ($x=0; $x<$maxItems; $x++)
{
    if (isset($s_items[$x]) && is_array($s_items[$x]))
    {
        // print data
        $item = $s_items[$x];
        echo "<li><a href=\"" . $item["link"] . "\">" . $item["title"] . "</a></li>";
    }
}
?>
</ul>
</td>
</tr>
</table>
</body>
</html>

This code is quite simple. Once you create an instance of the class with the "new" keyword,

$f = new RDFParser();

you can use the class methods to set the location of the RDF file and parse it,

$f->setResource("http://www.freshmeat.net/backend/fm-releases.rdf");

$f->parseResource();

and then retrieve the $channel and $items arrays for later processing.

$f_channel = $f->getChannelInfo();

$f_items = $f->getItems();

The latest from <a href="<?php echo $f_channel["link"]; ?>"><?php echo $f_channel["title"]; ?></a>
<br>
<ul>
<?php
// iterate through items array
for ($x=0; $x<$maxItems; $x++)
{
    if (isset($f_items[$x]) && is_array($f_items[$x]))
    {
        // print data
        $item = $f_items[$x];
        echo "<li><a href=\"" . $item["link"] . "\">" . $item["title"] . "</a></li>";
    }
}
?>
</ul>
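Since the display loop is written out once per channel, the same modularizing idea could be applied to the output as well. The sketch below is only one possible way to do that: renderChannel() is a hypothetical helper, not something from the class above, and it assumes its arguments have the shapes returned by getChannelInfo() and getItems(), with values already HTML-escaped (as characterData() does).

```php
<?php
// Hypothetical helper: build the "latest from ..." block for one channel.
// $channel: array with "title" and "link" keys
// $items:   list of arrays, each with "title" and "link" keys
// Values are assumed to be HTML-escaped already.
function renderChannel($channel, $items, $maxItems)
{
    $html = 'The latest from <a href="' . $channel["link"] . '">'
          . $channel["title"] . "</a><br>\n<ul>\n";

    // array_slice() caps the list at $maxItems and copes with shorter lists
    foreach (array_slice($items, 0, $maxItems) as $item) {
        $html .= '<li><a href="' . $item["link"] . '">'
               . $item["title"] . "</a></li>\n";
    }

    return $html . "</ul>\n";
}
```

With a helper like this, each table cell in the page could shrink to something like `echo renderChannel($f_channel, $f_items, $maxItems);`.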

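Each time you reload the script above, the corresponding RDF files are fetched from the specified locations, parsed, and displayed in the required format.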

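If your site gets a lot of traffic, fetching the feeds on every request may seem pointless, all the more so when the RDF data is not updated very often. In that case, it is probably wiser to look into caching the RDF data locally: either extend the example above to include caching, or download a local copy of the latest RDF file to your Web server every few hours and use that local copy rather than the "live" one.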
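As a rough sketch of the first option, the helper below keeps a local copy of a feed and only re-fetches it once that copy is older than a given age. Everything about it is an assumption made for illustration: the function name cachedResource(), the $maxAge default of two hours, the cache file path you pass in, and the optional $fetch callable (which defaults to file_get_contents() and therefore needs URL fopen support for remote feeds).

```php
<?php
// Hypothetical caching helper (not part of the RDFParser class).
// Returns the path of a local copy of $url, refreshing it when it is
// missing or older than $maxAge seconds. Returns false if there is no
// usable copy at all.
function cachedResource($url, $cacheFile, $maxAge = 7200, $fetch = 'file_get_contents')
{
    // make sure we look at the file's current state, not a cached stat result
    clearstatcache(true, $cacheFile);

    $fresh = is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge;

    if (!$fresh) {
        $data = call_user_func($fetch, $url);

        if (is_string($data) && $data !== "") {
            // write the new copy; LOCK_EX avoids two requests writing at once
            file_put_contents($cacheFile, $data, LOCK_EX);
        } else if (!is_file($cacheFile)) {
            // download failed and there is no older copy to fall back on
            return false;
        }
        // otherwise: download failed, keep serving the stale copy
    }

    return $cacheFile;
}

// Example use with the class from this article (the cache path is an assumption):
// $s = new RDFParser();
// $s->setResource(cachedResource("http://slashdot.org/slashdot.rdf", "/tmp/slashdot.rdf"));
// $s->parseResource();
```

Falling back to the stale copy when a download fails is a design choice: a slightly outdated headline list is usually better than a broken page. The second option from the paragraph above (a scheduled download) would move the fetch out of the request path entirely.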