I/O, memory, and network performance tuning commands: iostat, vmstat, netstat

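I came across this on a Sun site. The original is in English; here is the link:

http://www.adminschoice.com/docs/iostat_vmstat_netstat.htm

I did a rough translation. I have a feeling that someone posted something similar here before,

but never mind. It seems useful, so let's all take a look.

The translation is not great, so bear with me... If you spot mistakes, please point them out. Thanks in advance.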

Input Output statistics ( iostat )

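iostat reports terminal and disk I/O activity and CPU utilization. The first line of output covers the period from system boot until now; each subsequent line covers only the preceding interval. The kernel maintains a set of counters to track these values.

iostat's default options are tdc (terminal, disk, and CPU). If any other option is specified, this default is completely overridden; for example, iostat -d reports only disk statistics.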

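Syntax:

Basic syntax: iostat <options> interval count

options - select the devices to report on, such as disk, CPU, or terminal (-d, -c, -t, or -tdc). The -x option gives the extended statistics.

interval - the time in seconds between two samples.

count - the number of samples to report.

Example: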

$ iostat -xtc 5 2

                  extended disk statistics               tty        cpu
disk   r/s   w/s   Kr/s   Kw/s  wait  actv  svc_t  %w  %b  tin tout  us sy wt id
sd0    2.6   3.0   20.7   22.7   0.1   0.2   59.2   6  19    0   84   3 85 11  0
sd1    4.2   1.0   33.5    8.0   0.0   0.2   47.2   2  23
sd2    0.0   0.0    0.0    0.0   0.0   0.0    0.0   0   0
sd3   10.2   1.6   51.4   12.8   0.1   0.3   31.2   3  31

The fields have the following meanings:

disk    name of the disk
r/s     reads per second
w/s     writes per second
Kr/s    kilobytes read per second
Kw/s    kilobytes written per second
wait    average number of transactions waiting for service (queue length)
actv    average number of transactions actively being serviced (removed from the queue but not yet completed)
svc_t   average service time, in milliseconds
%w      percent of time there are transactions waiting for service (queue non-empty)
%b      percent of time the disk is busy (transactions in progress)

Results and Solutions:

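Values to watch in the iostat output:

Reads/writes per second (r/s, w/s)

Percentage busy (%b)

Service time (svc_t)

If a disk shows high reads/writes over a long period, its percentage busy (%b) is well above 5%, and its average service time (svc_t) is well above 30 milliseconds, one or more of the following actions should be taken:

1.) Tune the application to use disk I/O more efficiently, for example by modifying the disk queue and using the application server's cache.

2.) Spread the file system across two or more disks and use the striping feature of a volume manager/DiskSuite.

3.) Increase the system parameter values for the inode cache, such as ufs_ninode, which is the number of inodes to be held in memory. Inodes are cached globally (for UFS), not on a per-file-system basis.

4.) Move the file system to a faster disk/controller, or replace the existing hardware with something better.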
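
The rule of thumb above (%b well above 5% together with svc_t well above 30 ms) can be sketched as a small filter over iostat -x rows. This is only an illustration: the names parse_disk_row and suspect_disks are made up for this post, the sample rows are copied from the example output earlier, and a single sample is checked, whereas the text talks about values that stay high over time.

```python
# Thresholds taken from the rule of thumb in the text (assumed to be strict ">").
BUSY_PCT_LIMIT = 5.0       # %b
SVC_TIME_MS_LIMIT = 30.0   # svc_t, in milliseconds

# Column order of the extended disk statistics in the example output.
COLUMNS = ["disk", "r/s", "w/s", "Kr/s", "Kw/s", "wait", "actv", "svc_t", "%w", "%b"]


def parse_disk_row(line):
    """Turn one 'iostat -x' disk row into a dict; trailing tty/cpu columns are ignored."""
    parts = line.split()
    row = {"disk": parts[0]}
    for name, value in zip(COLUMNS[1:], parts[1:len(COLUMNS)]):
        row[name] = float(value)
    return row


def suspect_disks(rows):
    """Return the names of disks that exceed both the %b and the svc_t threshold."""
    return [r["disk"] for r in rows
            if r["%b"] > BUSY_PCT_LIMIT and r["svc_t"] > SVC_TIME_MS_LIMIT]


# Rows copied from the example output of 'iostat -xtc 5 2'.
sample = [
    "sd0 2.6 3.0 20.7 22.7 0.1 0.2 59.2 6 19 0 84 3 85 11 0",
    "sd1 4.2 1.0 33.5 8.0 0.0 0.2 47.2 2 23",
    "sd2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0",
    "sd3 10.2 1.6 51.4 12.8 0.1 0.3 31.2 3 31",
]
rows = [parse_disk_row(line) for line in sample]
print(suspect_disks(rows))  # ['sd0', 'sd1', 'sd3']
```

In practice you would feed it several interval samples and only act on disks that are flagged repeatedly, and you would skip the first report, since that one covers the whole time since boot.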

The other two sections will be posted once I have finished translating them...

Virtual Memory Statistics ( vmstat )

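vmstat - vmstat reports activity for processes, virtual memory, disks, traps, and the CPU. (Traps here are not the same thing as device interrupts, which vmstat lists separately in the "in" column.)

On multi-CPU systems, vmstat averages the CPU figures across all processors. For per-process statistics ... With no options, vmstat displays a one-line summary of virtual memory activity since system boot.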

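Syntax:

vmstat <options> interval count

options - select the type of information required, for example paging (-p), cache (-c), interrupts (-i), etc.

If no option is specified, information about processes, memory, paging, disks, interrupts, and CPU is displayed.

interval - same as for iostat

count - same as for iostat

Example: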

The following command displays a summary of what the system is doing every five seconds.

example% vmstat 5

 procs     memory            page             disk         faults       cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs  us sy id
 0 0 0  11456  4120   1  41 19  1  3  0  2  0  4  0  0   48  112  130   4 14 82
 0 0 1  10132  4280   0   4 44  0  0  0  0  0 23  0  0  211  230  144   3 35 62
 0 0 1  10132  4616   0   0 20  0  0  0  0  0 19  0  0  150  172  146   3 33 64
 0 0 1  10132  5292   0   0  9  0  0  0  0  0 21  0  0  165  105  130   1 21 78

The fields of vmstat's display are:

procs
  r     in run queue
  b     blocked for resources (I/O, paging, etc.)
  w     swapped

memory (in Kbytes)
  swap  amount of swap space currently available
  free  size of the free list

page (in units per second)
  re    page reclaims - see -S option for how this field is modified.
  mf    minor faults - see -S option for how this field is modified.
  pi    kilobytes paged in
  po    kilobytes paged out
  fr    kilobytes freed
  de    anticipated short-term memory shortfall (Kbytes)
  sr    pages scanned by clock algorithm

disk (operations per second)
  There are slots for up to four disks, labeled with a single letter and number. The letter indicates the type of disk (s = SCSI, i = IPI, etc.). The number is the logical unit number.

faults
  in    (non clock) device interrupts
  sy    system calls
  cs    CPU context switches

cpu - breakdown of percentage usage of CPU time. On multiprocessors this is an average across all processors.
  us    user time
  sy    system time
  id    idle time
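
One thing that is easy to trip over when scripting against this output is that "sy" appears twice: once under faults (system calls) and once under cpu (system time). A minimal sketch of a parser that keeps them apart by prefixing each column with its group name is given below. The function name parse_vmstat_row and the dotted key names are invented for this post, and the disk columns s0 to s3 are simply the ones from the example; another machine may show different disks.

```python
# Column layout of the default vmstat output shown in the example,
# with each name prefixed by its group so the two "sy" columns stay distinct.
VMSTAT_FIELDS = [
    "procs.r", "procs.b", "procs.w",
    "memory.swap", "memory.free",
    "page.re", "page.mf", "page.pi", "page.po", "page.fr", "page.de", "page.sr",
    "disk.s0", "disk.s1", "disk.s2", "disk.s3",   # disk names are from the example only
    "faults.in", "faults.sy", "faults.cs",
    "cpu.us", "cpu.sy", "cpu.id",
]


def parse_vmstat_row(line):
    """Map one data row of default vmstat output to a dict of integers."""
    values = [int(v) for v in line.split()]
    if len(values) != len(VMSTAT_FIELDS):
        raise ValueError("expected %d columns, got %d" % (len(VMSTAT_FIELDS), len(values)))
    return dict(zip(VMSTAT_FIELDS, values))


# Second data row of the 'vmstat 5' example.
row = parse_vmstat_row("0 0 1 10132 4280 0 4 44 0 0 0 0 0 23 0 0 211 230 144 3 35 62")
print(row["faults.sy"], row["cpu.sy"])  # 230 35
```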

Results and Solutions:

A. CPU issues:

Check the following columns to determine whether there is a CPU problem:

Processes in the run queue (procs r)

User time (cpu us)

System time (cpu sy)

Idle time (cpu id)

 procs      cpu
 r b w   us sy id
 0 0 0    4 14 82
 0 0 1    3 35 62
 0 0 1    3 33 64
 0 0 1    1 21 78

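Problem situations:

1.) If the number of processes in the run queue (procs r) is well above the number of CPUs in the system, the system will slow down.

2.) If this number is four times the number of CPUs, the system is facing a shortage of CPU power, and it will run much more slowly.

3.) If the CPU idle time is frequently 0, or if the system time (cpu sy) is twice the user time (cpu us), the system is short of CPU resources.

Resolution:

Resolving these situations involves tuning the application so that it uses the CPU more efficiently, and adding CPU power or more CPUs.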
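
The three CPU conditions above can be written down as a small check, shown here as a rough sketch only. The function name cpu_warnings is invented for this post, the CPU count has to be supplied by the caller (the example output does not say how many CPUs the machine has, so 1 is assumed below), and the check looks at a single sample, whereas the text is about values that persist.

```python
def cpu_warnings(run_queue, ncpus, us, sy, idle):
    """Apply the rules of thumb from the text to one vmstat sample."""
    warnings = []
    if run_queue >= 4 * ncpus:
        warnings.append("run queue is at least 4x the CPU count: CPU shortage")
    elif run_queue > ncpus:
        warnings.append("run queue exceeds the CPU count: expect slowdown")
    if idle == 0:
        warnings.append("no idle time")
    # Guard against an idle system where both us and sy are 0.
    if sy > 0 and sy >= 2 * us:
        warnings.append("system time is at least twice user time")
    return warnings


# First data row of the example: r=0, us=4, sy=14, id=82; one CPU is assumed.
print(cpu_warnings(run_queue=0, ncpus=1, us=4, sy=14, idle=82))
# ['system time is at least twice user time']
```

Note that every row of the example output trips the system-time rule, even though the machine is mostly idle, so these rules are best read together rather than one at a time.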

B. Memory Issues:

Memory bottlenecks are identified by the scan rate (sr). The scan rate is the number of pages scanned by the clock algorithm per second. If the scan rate (sr) is continuously above 200 pages per second, there is a memory shortage.

Resolution:

1. Tune the applications and servers to make better use of memory and cache.

2. Increase system memory.

3. Implement priority paging in pre-Solaris 8 versions by adding the line "set priority_paging=1" to /etc/system. Remove this line if upgrading from Solaris 7 to 8 and retaining the old /etc/system file.
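
The scan-rate rule says "continuously" above 200 pages per second, so a single high sample should not count. A minimal sketch of that idea follows. The function name sustained_scanning and the choice of three consecutive samples are assumptions made for this post; the original article does not say how long "continuously" is.

```python
def sustained_scanning(sr_samples, limit=200, min_consecutive=3):
    """Return True if sr stays above `limit` for `min_consecutive` samples in a row."""
    run = 0
    for sr in sr_samples:
        run = run + 1 if sr > limit else 0
        if run >= min_consecutive:
            return True
    return False


# sr column from the four rows of the 'vmstat 5' example.
print(sustained_scanning([2, 0, 0, 0]))              # False
# A made-up series with one spike followed by a sustained run.
print(sustained_scanning([350, 10, 260, 410, 305]))  # True
```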

Network Statistics (netstat)

netstat displays the contents of various network-related data structures, depending on the options selected.

Syntax:

netstat <option/s>

Multiple options can be given at the same time.

Options

-a - displays the state of all sockets.

-r - shows the system routing tables.

-i - gives statistics on a per-interface basis.

-m - displays information from the network memory buffers. On Solaris, this shows statistics for STREAMS.

-p [proto] - retrieves statistics for the specified protocol.

-s - shows per-protocol statistics. (Some implementations allow -ss to remove fields with a value of 0 (zero) from the display.)

-D - displays the status of DHCP-configured interfaces.

-n - does not look up hostnames; displays only IP addresses.

-d - (with -i) displays dropped packets per interface.

-I [interface] - retrieves information about only the specified interface.

-v - be verbose.

interval - number of seconds between updates, for continuous display of statistics.

Example :

$ netstat -rn

Routing Table: IPv4
  Destination           Gateway            Flags  Ref   Use   Interface
-------------------- -------------------- ----- ----- ------ ---------
192.168.1.0          192.168.1.11           U       1   1444  le0
224.0.0.0            192.168.1.11           U       1      0  le0
default              192.168.1.1            UG      1  68276
127.0.0.1            127.0.0.1              UH      1  10497  lo0

This shows a Solaris machine whose IP address is 192.168.1.11 and whose default gateway is 192.168.1.1.
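
Reading the default gateway off this table is easy to script. The sketch below just looks for the row whose destination is "default". The function name default_gateway is invented for this post, and the sample text is the routing table from the example; output from other netstat versions may be laid out differently.

```python
def default_gateway(netstat_rn_output):
    """Return the gateway of the 'default' route, or None if there is no such row."""
    for line in netstat_rn_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "default":
            return fields[1]
    return None


# Routing table copied from the 'netstat -rn' example.
sample_output = """\
Routing Table: IPv4
Destination Gateway Flags Ref Use Interface
-------------------- -------------------- ----- ----- ------ ---------
192.168.1.0 192.168.1.11 U 1 1444 le0
224.0.0.0 192.168.1.11 U 1 0 le0
default 192.168.1.1 UG 1 68276
127.0.0.1 127.0.0.1 UH 1 10497 lo0
"""
print(default_gateway(sample_output))  # 192.168.1.1
```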

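Results and Solutions:

A.) Network availability

The command above is often used to check network connectivity, especially when the machine cannot reach external networks. Check the following:

1. Is the default route IP address correct?

2. Can the gateway be pinged?

3. If the gateway address is wrong, it can be corrected with the route add command; see man route.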

route command examples:

$ route add default <hostname>

$ route add 192.0.2.32 <gateway_name>

If the router address is correct but you still cannot ping it, there may be a problem with a network cable, hub, or switch, and you will have to find and eliminate the faulty component.

B.) Network response

$ netstat -i

Name  Mtu   Net/Dest  Address    Ipkts     Ierrs  Opkts    Oerrs  Collis  Queue
lo0   8232  loopback  localhost  77814     0      77814    0      0       0
hme0  1500  server1   server1    10658566  3      4832511  0      279257  0

This option is used to diagnose network problems where the network is reachable but connections are slow.

Values to look at:

Collisions (Collis)

Output packets (Opkts)

Input errors (Ierrs)

Input packets (Ipkts)

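The above values give the information needed to work out the following:

i. Network collision rate:

Network collision rate = Output collision counts / Output packets

A network collision rate greater than 10% indicates an overloaded network, an incorrectly configured network, or hardware problems.

ii. Input packet error rate:

Input Packet Error Rate = Ierrs / Ipkts.

If the input error rate is high (over 0.25 percent), the host is dropping packets. The hub/switch and cabling should be checked for potential problems.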
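
As a worked example, the two formulas can be applied to the hme0 row of the netstat -i output above. This is just a sketch: the function names are invented for this post, and the 10% and 0.25% limits are the ones quoted in the text.

```python
COLLISION_RATE_LIMIT = 0.10      # 10%, from the text
INPUT_ERROR_RATE_LIMIT = 0.0025  # 0.25%, from the text


def collision_rate(collis, opkts):
    """Collisions divided by output packets; 0.0 if nothing was sent."""
    return collis / opkts if opkts else 0.0


def input_error_rate(ierrs, ipkts):
    """Input errors divided by input packets; 0.0 if nothing was received."""
    return ierrs / ipkts if ipkts else 0.0


# Counters from the hme0 row of the 'netstat -i' example.
ipkts, ierrs, opkts, collis = 10658566, 3, 4832511, 279257

c_rate = collision_rate(collis, opkts)
e_rate = input_error_rate(ierrs, ipkts)
print("collision rate: %.2f%%" % (c_rate * 100))      # collision rate: 5.78%
print("input error rate: %.5f%%" % (e_rate * 100))
print("collisions too high:", c_rate > COLLISION_RATE_LIMIT)
print("input errors too high:", e_rate > INPUT_ERROR_RATE_LIMIT)
```

For this interface the collision rate comes out a little under 6% and the input error rate is negligible, so neither limit is exceeded.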

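C. Network socket & TCP connection state

netstat gives important information about network sockets and TCP state. This is very useful for finding open, closed, and waiting TCP connections.

The network states returned by netstat are as follows: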

CLOSED ---- Closed. The socket is not being used.

LISTEN ---- Listening for incoming connections.

SYN_SENT ---- Actively trying to establish connection.

SYN_RECEIVED ---- Initial synchronization of the connection under way.

ESTABLISHED ---- Conn
