Author: Ori Pomerantz
Chinese translation: 徐辉
4. Using /proc for Input
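We now have two ways to generate output from kernel modules: we can register a device driver and mknod a device file, or we can create a /proc file. Either one lets the kernel tell us anything it likes. The remaining problem is that we have no way to talk back. The first way we will send input to the kernel is by writing to the /proc file.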
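Because the proc file system was written mainly to let the kernel report its state to processes, it makes no special provision for input. The proc_dir_entry structure does not contain a pointer to an input function the way it contains a pointer to an output function. To write to a /proc file, we have to use the standard file system mechanism instead.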
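Linux has a standard mechanism for file system registration. Every file system has its own functions to handle inode and file operations, so there is a special structure that holds pointers to all of those functions, struct inode_operations, and it contains a pointer to a struct file_operations. In /proc, whenever we register a new file, we are allowed to specify which struct inode_operations is used to access it. That is the mechanism we use here: an inode_operations that contains a pointer to a file_operations, which in turn contains pointers to our module_input and module_output functions.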
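The nesting described above can be sketched in ordinary user-space C. This is only an illustration of the pointer chain, not the kernel's definitions: the demo_ structures, their members, and the dispatch helpers are hypothetical stand-ins, and the real struct inode_operations and struct file_operations have many more members.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct file_operations. */
struct demo_file_ops {
    long (*read)(char *buf, size_t len);         /* module output */
    long (*write)(const char *buf, size_t len);  /* module input  */
};

/* Hypothetical, cut-down stand-in for struct inode_operations. */
struct demo_inode_ops {
    const struct demo_file_ops *default_file_ops;
    int (*permission)(int op);
};

static char demo_last_input[80];

/* Copy the stored message out; returns the number of bytes copied. */
static long demo_module_output(char *buf, size_t len)
{
    size_t n = strlen(demo_last_input);
    if (n > len)
        n = len;
    memcpy(buf, demo_last_input, n);
    return (long)n;
}

/* Store the message, truncated to fit and zero terminated. */
static long demo_module_input(const char *buf, size_t len)
{
    if (len > sizeof demo_last_input - 1)
        len = sizeof demo_last_input - 1;
    memcpy(demo_last_input, buf, len);
    demo_last_input[len] = '\0';
    return (long)len;
}

static int demo_allow_all(int op) { (void)op; return 0; }

static const struct demo_file_ops demo_fops = {
    demo_module_output, demo_module_input
};
static const struct demo_inode_ops demo_iops = {
    &demo_fops, demo_allow_all
};

/* What a caller does: check permission, then follow the two pointers. */
static long demo_dispatch_write(const struct demo_inode_ops *iops,
                                const char *buf, size_t len)
{
    if (iops->permission != NULL && iops->permission(2) != 0)
        return -1;
    if (iops->default_file_ops == NULL ||
        iops->default_file_ops->write == NULL)
        return -1;
    return iops->default_file_ops->write(buf, len);
}

static long demo_dispatch_read(const struct demo_inode_ops *iops,
                               char *buf, size_t len)
{
    if (iops->permission != NULL && iops->permission(4) != 0)
        return -1;
    if (iops->default_file_ops == NULL ||
        iops->default_file_ops->read == NULL)
        return -1;
    return iops->default_file_ops->read(buf, len);
}
```

Registering a file then amounts to handing over one pointer (here &demo_iops); everything else is reached through it.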
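Note that the standard roles of read and write are reversed in the kernel: read functions are used for output, and write functions are used for input. The reason is that read and write are named from the user's point of view. If a process reads something from the kernel, the kernel has to output it, and if a process writes something to the kernel, the kernel receives it as input.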
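Seen from the process side, the exchange might look like the sketch below. The helper name demo_write_then_read is hypothetical, and the code only assumes that the path it is given already exists and accepts both write and read, as the example's /proc file would once the module is loaded. The write() call is what the kernel sees as input, and the read() call is what the kernel has to answer with output.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg to the file at path, then read the file's reply into
 * reply (zero terminated). Returns the number of bytes read,
 * or -1 on any error. */
static ssize_t demo_write_then_read(const char *path, const char *msg,
                                    char *reply, size_t reply_size)
{
    ssize_t n;
    int fd;

    if (reply_size == 0)
        return -1;

    /* The process writes: on the kernel side this is input. */
    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    n = write(fd, msg, strlen(msg));
    close(fd);
    if (n < 0)
        return -1;

    /* The process reads: on the kernel side this is output. */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = read(fd, reply, reply_size - 1);
    close(fd);
    if (n < 0)
        return -1;

    reply[n] = '\0';
    return n;
}
```

With the module from this chapter loaded, the path would be the rw_test entry under /proc; with an ordinary file the reply is simply the bytes that were written.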
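Another interesting point is the module_permission function. It is called whenever a process tries to do something with the /proc file, and it decides whether to allow the access. At present the decision is based only on the operation and on the uid of the current user (available through current, a pointer to a structure holding information about the currently running process), but it could be based on anything we like, such as what other processes are doing with the same file, the time of day, or the last input received.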
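The decision rule itself is small enough to try outside the kernel. The following is a user-space sketch under stated assumptions: demo_check_access and the DEMO_ constants are hypothetical names, and the effective uid is passed in as a parameter because a user-space program has no current pointer. The operation codes 2 (write) and 4 (read) are the ones used in the listing.

```c
#include <errno.h>

#define DEMO_OP_WRITE 2  /* input to the kernel module */
#define DEMO_OP_READ  4  /* output from the kernel module */

/* Returns 0 to allow the operation, or a negative errno value
 * explaining why it is refused. */
static int demo_check_access(int op, unsigned int euid)
{
    /* Everybody may read. */
    if (op == DEMO_OP_READ)
        return 0;

    /* Only root (effective uid 0) may write. */
    if (op == DEMO_OP_WRITE && euid == 0)
        return 0;

    /* Anything else is denied. */
    return -EACCES;
}
```

Basing the decision on something else, such as the time of day, would only mean adding more parameters and more conditions to this one function.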
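The reason for using put_user and get_user is that Linux memory is segmented (under the Intel architecture; it may differ on other processor families). This means that a pointer by itself does not identify a unique location in memory, only a location within a segment, so you also need to know which segment it belongs to before you can use it. The kernel has one segment, and each process has one of its own.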
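The only segment a process can access is its own, so when you write programs that run as processes you do not need to worry about segments. When you write a kernel module, you normally want to access the kernel's segment, and the system handles that automatically. However, when the contents of a memory buffer have to be passed between the currently running process and the kernel, the kernel function receives a pointer to a buffer that lives in the process's segment. put_user and get_user let you access that memory.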
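The byte-at-a-time copying pattern can be tried out in user space. This is a sketch only: demo_get_byte and demo_put_byte are hypothetical stand-ins that dereference an ordinary pointer, whereas the real get_user and put_user are kernel facilities that cross from one address space to the other. What the sketch does show accurately is the bounds checking and zero termination around them.

```c
#include <stddef.h>

#define DEMO_MESSAGE_LENGTH 80

/* Stands in for the buffer kept inside the kernel module. */
static char demo_kernel_buf[DEMO_MESSAGE_LENGTH];

/* Hypothetical stand-ins for get_user and put_user. */
static char demo_get_byte(const char *user_ptr) { return *user_ptr; }
static void demo_put_byte(char value, char *user_ptr) { *user_ptr = value; }

/* Input: copy at most DEMO_MESSAGE_LENGTH-1 bytes from the caller's
 * buffer and zero terminate. Returns the number of bytes used. */
static size_t demo_copy_in(const char *user_buf, size_t length)
{
    size_t i;

    for (i = 0; i < DEMO_MESSAGE_LENGTH - 1 && i < length; i++)
        demo_kernel_buf[i] = demo_get_byte(user_buf + i);
    demo_kernel_buf[i] = '\0';
    return i;
}

/* Output: copy the stored string into the caller's buffer, stopping
 * at the terminator or after len bytes. Returns the bytes copied. */
static size_t demo_copy_out(char *user_buf, size_t len)
{
    size_t i;

    for (i = 0; i < len && demo_kernel_buf[i] != '\0'; i++)
        demo_put_byte(demo_kernel_buf[i], user_buf + i);
    return i;
}
```

Both loops stop on whichever limit is reached first, so neither the module's buffer nor the caller's buffer is overrun even when the caller passes a length larger than the message.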
ex procfs.c
/* procfs.c - create a "file" in /proc, which allows
* both input and output. */
/* Copyright (C) 1998-1999 by Ori Pomerantz */
/* The necessary header files */
/* Standard in kernel modules */
#include <linux/kernel.h> /* We're doing kernel work */
#include <linux/module.h> /* Specifically, a module */
/* Deal with CONFIG_MODVERSIONS */
#if CONFIG_MODVERSIONS==1
#define MODVERSIONS
#include <linux/modversions.h>
#endif
/* Necessary because we use proc fs */
#include <linux/proc_fs.h>
/* In 2.2.3 /usr/include/linux/version.h includes a
* macro for this, but 2.0.35 doesn't - so I add it
* here if necessary. */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a,b,c) ((a)*65536+(b)*256+(c))
#endif
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
#include <asm/uaccess.h> /* for get_user and put_user */
#endif
/* The module's file functions ********************** */
/* Here we keep the last message received, to prove
* that we can process our input */
#define MESSAGE_LENGTH 80
static char Message[MESSAGE_LENGTH];
/* Since we use the file operations struct, we can't
* use the special proc output provisions - we have to
* use a standard read function, which is this function */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
static ssize_t module_output(
struct file *file, /* The file read */
char *buf, /* The buffer to put data to (in the
* user segment) */
size_t len, /* The length of the buffer */
loff_t *offset) /* Offset in the file - ignore */
#else
static int module_output(
struct inode *inode, /* The inode read */
struct file *file, /* The file read */
char *buf, /* The buffer to put data to (in the
* user segment) */
int len) /* The length of the buffer */
#endif
{
static int finished = 0;
int i;
char message[MESSAGE_LENGTH+30];
/* We return 0 to indicate end of file, that we have
* no more information. Otherwise, processes will
* continue to read from us in an endless loop. */
if (finished) {
finished = 0;
return 0;
}
/* We use put_user to copy the string from the kernel's
* memory segment to the memory segment of the process
* that called us. get_user, BTW, is
* used for the reverse. */
sprintf(message, "Last input:%s", Message);
for(i=0; i<len && message[i]; i++)
  put_user(message[i], buf+i);
/* Notice, we assume here that the size of the message
* is below len, or it will be received cut. In a real
* life situation, if the size of the message is more
* than len then we'd return len and on the second call
* start filling the buffer with the len+1'th byte of
* the message. */
finished = 1;
return i; /* Return the number of bytes "read" */
}
/* This function receives input from the user when the
* user writes to the /proc file. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
static ssize_t module_input(
struct file *file, /* The file itself */
const char *buf, /* The buffer with input */
size_t length, /* The buffer's length */
loff_t *offset) /* offset to file - ignore */
#else
static int module_input(
struct inode *inode, /* The file's inode */
struct file *file, /* The file itself */
const char *buf, /* The buffer with the input */
int length) /* The buffer's length */
#endif
{
int i;
/* Put the input into Message, where module_output
* will later be able to use it */
for(i=0; i<MESSAGE_LENGTH-1 && i<length; i++)
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
get_user(Message[i], buf+i);
/* In version 2.2 the semantics of get_user changed,
* it no longer returns a character, but expects a
* variable to fill up as its first argument and a
* user segment pointer to fill it from as its
* second.
*
* The reason for this change is that the version 2.2
* get_user can also read a short or an int. The way
* it knows the type of the variable it should read
* is by using sizeof, and for that it needs the
* variable itself.
*/
#else
Message[i] = get_user(buf+i);
#endif
Message[i] = '\0'; /* we want a standard, zero
* terminated string */
/* We need to return the number of input characters
* used */
return i;
}
/* This function decides whether to allow an operation
* (return zero) or not allow it (return a non-zero
* which indicates why it is not allowed).
*
* The operation can be one of the following values:
* 1 - Execute (run the "file" - meaningless in our case)
* 2 - Write (input to the kernel module)
* 4 - Read (output from the kernel module)
*
* This is the real function that checks file
* permissions. The permissions returned by ls -l are
* for reference only, and can be overridden here.
*/
static int module_permission(struct inode *inode, int op)
{
/* We allow everybody to read from our module, but
* only root (uid 0) may write to it */
if (op == 4 || (op == 2 && current->euid == 0))
return 0;
/* If it's anything else, access is denied */
return -EACCES;
}
/* The file is opened - we don't really care about
* that, but it does mean we need to increment the
* module's reference count. */
int module_open(struct inode *inode, struct file *file)
{
MOD_INC_USE_COUNT;
return 0;
}
/* The file is closed - again, interesting only because
* of the reference count. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
int module_close(struct inode *inode, struct file *file)
#else
void module_close(struct inode *inode, struct file *file)
#endif
{
MOD_DEC_USE_COUNT;
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
return 0; /* success */
#endif
}
/* Structures to register as the /proc file, with
* pointers to all the relevant functions. ********** */
/* File operations for our proc file. This is where we
* place pointers to all the functions called when
* somebody tries to do something to our file. NULL
* means we don't want to deal with something. */
static struct file_operations File_Ops_4_Our_Proc_File =
{
NULL, /* lseek */
module_output, /* "read" from the file */
module_input, /* "write" to the file */
NULL, /* readdir */
NULL, /* select */
NULL, /* ioctl */
NULL, /* mmap */
module_open, /* Somebody opened the file */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,2,0)
NULL, /* flush, added here in version 2.2 */
#endif
module_close, /* Somebody closed the file */
/* etc. etc. etc. (they are all given in
* /usr/include/linux/fs.h). Since we don't put
* anything here, the system will keep the default
* data, which in Unix is zeros (NULLs when taken as
* pointers). */
};
/* Inode operations for our proc file. We need it so
* we'll have some place to specify the file operations
* structure we want to use, and the function we use for
* permissions. It's also possible to specify functions
* to be called for anything else which could be done to
* an inode (although we don't bother, we just put
* NULL). */
static struct inode_operations Inode_Ops_4_Our_Proc_File =
{
&File_Ops_4_Our_Proc_File,
NULL, /* create */
NULL, /* lookup */
NULL, /* link */
NULL, /* unlink */
NULL, /* symlink */
NULL, /* mkdir */
NULL, /* rmdir */
NULL, /* mknod */
NULL, /* rename */
NULL, /* readlink */
NULL, /* follow_link */
NULL, /* readpage */
NULL, /* writepage */
NULL, /* bmap */
NULL, /* truncate */
module_permission /* check for permissions */
};
/* Directory entry */
static struct proc_dir_entry Our_Proc_File =
{
0, /* Inode number - ignore, it will be filled by
* proc_register[_dynamic] */
7, /* Length of the file name */
"rw_test", /* The file name */
S_IFREG | S_IRUGO | S_IWUSR,
/* File mode - this is a regular file which
* can be read by its owner, its group, and everybody
* else. Also, its owner can write to it.
*
* Actually, this field is just for reference, it's
* module_permission that does the actual check. It
* could use this field, but in our implementation it
* doesn't, for simplicity. */
1, /* Number of links (d