Analyzing Slow Performance on a Linux System

The default options for iostat are tdc (terminal, disk, and CPU). If any other option is specified, this default is replaced entirely; for example, iostat -d reports only disk statistics.

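　　Syntax:

　　Basic syntax: iostat [option] [interval] [count]

　　option - selects the devices to report on, such as disks, CPU, or terminals (-d, -c, -t, or -tdc). The x option gives the extended statistics.

　　interval - the time, in seconds, between two samples.

　　count - the number of reports to produce.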

　　Example:

　　$ iostat -xtc 5 2

　　             extended disk statistics                  tty       cpu
　　disk   r/s  w/s  Kr/s  Kw/s wait actv svc_t  %w  %b  tin tout us sy wt id
　　sd0    2.6  3.0  20.7  22.7  0.1  0.2  59.2   6  19    0   84  3 85 11  0
　　sd1    4.2  1.0  33.5   8.0  0.0  0.2  47.2   2  23
　　sd2    0.0  0.0   0.0   0.0  0.0  0.0   0.0   0   0
　　sd3   10.2  1.6  51.4  12.8  0.1  0.3  31.2   3  31

　　The fields have the following meanings:

　　disk    name of the disk
　　r/s     reads per second
　　w/s     writes per second
　　Kr/s    kilobytes read per second
　　Kw/s    kilobytes written per second
　　wait    average number of transactions waiting for service (queue length)
　　actv    average number of transactions actively being serviced (removed from the queue but not yet completed)
　　svc_t   average service time, in milliseconds
　　%w      percent of time there are transactions waiting for service (queue non-empty)
　　%b      percent of time the disk is busy (transactions in progress)

　　Results and Solutions:

　　Values to watch in the iostat output:

　　Reads/writes per second (r/s, w/s)

　　Percentage busy (%b)

　　Service time (svc_t)

　　If a disk shows high reads/writes for a long period, its percentage busy (%b) is well above 5%, and its average service time (svc_t) is well above 30 milliseconds, the following measures should be taken:

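　　1.) Tune the application so that it uses disk I/O more efficiently, for example by adjusting disk queues or by using the application server's cache.

　　2.) Spread the file system across two or more disks, using the striping feature of a volume manager/DiskSuite.

　　3.) Increase the system parameter values for the inode cache, such as ufs_ninode, which is the number of inodes to be held in memory. Inodes are cached globally (for UFS), not on a per-file system basis.

　　4.) Move the file system to a faster disk/controller, or replace the hardware with better devices.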
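
　　The %b and svc_t thresholds above can be checked mechanically. The following is a minimal Python sketch, assuming the column order of the extended disk statistics in the example output. The function names and the strict ">" comparison are illustrative assumptions, and the "high reads/writes for a long period" condition is not modelled because the text gives no number for it.

```python
# Thresholds quoted in the text above; "well above" is simplified to ">".
BUSY_PCT_LIMIT = 5.0       # %b
SVC_TIME_MS_LIMIT = 30.0   # svc_t, milliseconds

# Column order of the extended disk statistics in the example output.
FIELDS = ["disk", "r/s", "w/s", "Kr/s", "Kw/s", "wait", "actv", "svc_t", "%w", "%b"]


def parse_disk_line(line):
    """Return a dict for one disk row, or None for headers and short lines."""
    parts = line.split()
    if len(parts) < len(FIELDS):
        return None
    row = {"disk": parts[0]}
    try:
        for name, value in zip(FIELDS[1:], parts[1:len(FIELDS)]):
            row[name] = float(value)
    except ValueError:
        return None  # non-numeric columns: this is a header line
    return row


def flag_busy_disks(lines):
    """List the disks whose %b and svc_t both exceed the limits."""
    flagged = []
    for line in lines:
        row = parse_disk_line(line)
        if row and row["%b"] > BUSY_PCT_LIMIT and row["svc_t"] > SVC_TIME_MS_LIMIT:
            flagged.append(row["disk"])
    return flagged


# The sample report from the example above.
SAMPLE = """\
extended disk statistics tty cpu
disk r/s w/s Kr/s Kw/s wait actv svc_t %w %b tin tout us sy wt id
sd0 2.6 3.0 20.7 22.7 0.1 0.2 59.2 6 19 0 84 3 85 11 0
sd1 4.2 1.0 33.5 8.0 0.0 0.2 47.2 2 23
sd2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
sd3 10.2 1.6 51.4 12.8 0.1 0.3 31.2 3 31
"""

print(flag_busy_disks(SAMPLE.splitlines()))  # ['sd0', 'sd1', 'sd3']
```

　　A single report is only a snapshot; since the text asks for sustained load, a real check would probably want to see the same disk flagged over several intervals before acting.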

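　　vmstat - reports statistics on process, virtual memory, disk, trap, and CPU activity.

　　On multi-CPU systems, vmstat averages its output across the number of CPUs. For per-process statistics . Without options, vmstat displays a one-line summary of virtual memory activity since the system was booted.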

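　　Syntax:

　　vmstat [option] [interval] [count]

　　option - selects the type of information to report, for example paging (-p), cache (-c), interrupts (-i), etc.

　　If no option is specified, process, memory, paging, disk, interrupt, and CPU information is displayed.

　　interval - same as for iostat

　　count - same as for iostat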

　　Example

　　The following command displays a summary of what the system is doing every five seconds.

　　example% vmstat 5

　　procs    memory            page              disk        faults        cpu
　　r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
　　0 0 0  11456  4120   1  41 19  1  3  0  2  0  4  0  0   48  112  130  4 14 82
　　0 0 1  10132  4280   0   4 44  0  0  0  0  0 23  0  0  211  230  144  3 35 62
　　0 0 1  10132  4616   0   0 20  0  0  0  0  0 19  0  0  150  172  146  3 33 64
　　0 0 1  10132  5292   0   0  9  0  0  0  0  0 21  0  0  165  105  130  1 21 78

　　The fields of vmstat's display are:

　　procs
　　r     in run queue
　　b     blocked for resources (I/O, paging, etc.)
　　w     swapped

　　memory (in Kbytes)
　　swap  amount of swap space currently available
　　free  size of the free list

　　page (in units per second)
　　re    page reclaims - see -S option for how this field is modified.
　　mf    minor faults - see -S option for how this field is modified.
　　pi    kilobytes paged in
　　po    kilobytes paged out
　　fr    kilobytes freed
　　de    anticipated short-term memory shortfall (Kbytes)
　　sr    pages scanned by clock algorithm

　　disk (operations per second)
　　There are slots for up to four disks, labeled with a single letter and number. The letter indicates the type of disk (s = SCSI, i = IPI, etc.). The number is the logical unit number.

　　faults
　　in    (non clock) device interrupts
　　sy    system calls
　　cs    CPU context switches

　　cpu - breakdown of percentage usage of CPU time. On multiprocessors this is an average across all processors.
　　us    user time
　　sy    system time
　　id    idle time
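
　　Because the column name sy appears twice (system calls under faults, system time under cpu), any script that reads vmstat rows needs to keep the two apart. The sketch below is one possible approach in Python, assuming the 22-column layout of the example output; the "group.name" key scheme and the function name parse_vmstat_row are illustrative choices, not part of vmstat.

```python
# Column groups in the order they appear in the example header.
COLUMNS = [
    ("procs", ["r", "b", "w"]),
    ("memory", ["swap", "free"]),
    ("page", ["re", "mf", "pi", "po", "fr", "de", "sr"]),
    ("disk", ["s0", "s1", "s2", "s3"]),
    ("faults", ["in", "sy", "cs"]),
    ("cpu", ["us", "sy", "id"]),
]

# Flattened keys such as "faults.sy" and "cpu.sy" keep the duplicates distinct.
KEYS = [f"{group}.{name}" for group, names in COLUMNS for name in names]


def parse_vmstat_row(line):
    """Map one numeric vmstat row onto group-qualified column names."""
    values = [int(v) for v in line.split()]
    if len(values) != len(KEYS):
        raise ValueError(f"expected {len(KEYS)} columns, got {len(values)}")
    return dict(zip(KEYS, values))


# Second data row of the example output.
row = parse_vmstat_row("0 0 1 10132 4280 0 4 44 0 0 0 0 0 23 0 0 211 230 144 3 35 62")
print(row["faults.sy"], row["cpu.sy"])  # 230 35
```

　　The disk slot names (s0 to s3) differ from machine to machine, so a more general version would read the key list from the header line instead of hard-coding it.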

　　Results and Solutions:

　　A. CPU issues:

　　Check the following columns to determine whether the CPU is a problem:

　　Processes in the run queue (procs r)

　　User time (cpu us)

　　System time (cpu sy)

　　Idle time (cpu id)

　　procs     cpu
　　r b w   us sy id
　　0 0 0    4 14 82
　　0 0 1    3 35 62
　　0 0 1    3 33 64
　　0 0 1    1 21 78

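　　Problem conditions:

　　1.) If the number of processes in the run queue (procs r) is much greater than the number of CPUs in the system, the system will slow down.

　　2.) If this number is four times the number of CPUs, the system is facing a shortage of CPU power, and its speed will drop sharply.

　　3.) If the CPU idle time is frequently 0, or the system time (cpu sy) is twice the user time (cpu us), the system is short of CPU resources.

　　Solution:

　　Resolving these conditions involves tuning the application so that it uses the CPU more efficiently, and increasing the CPU power or the number of CPUs.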
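
　　The three CPU conditions can be expressed as a small check over a series of vmstat samples. This is a rough Python sketch under stated assumptions: "much greater" is simplified to "greater than the CPU count", "frequently" is taken to mean more than half of the samples, the CPU count of 1 is hypothetical (the text does not say how many CPUs the example machine has), and the function name and result tags are invented for illustration.

```python
def cpu_findings(samples, ncpu):
    """Apply the three CPU rules to vmstat samples.

    samples: list of dicts with keys r, us, sy, id (one per vmstat row).
    ncpu: number of CPUs in the system.
    """
    if not samples:
        return []
    findings = []
    n = len(samples)

    # Rules 1 and 2: compare the average run queue length with the CPU count.
    avg_r = sum(s["r"] for s in samples) / n
    if avg_r >= 4 * ncpu:
        findings.append("run_queue_4x_cpus")
    elif avg_r > ncpu:  # assumption: stands in for "much greater"
        findings.append("run_queue_above_cpus")

    # Rule 3a: idle time frequently 0 (assumption: in over half the samples).
    zero_idle = sum(1 for s in samples if s["id"] == 0)
    if zero_idle > n / 2:
        findings.append("idle_frequently_zero")

    # Rule 3b: system time at least twice user time in over half the samples.
    sys_heavy = sum(1 for s in samples if s["sy"] > 0 and s["sy"] >= 2 * s["us"])
    if sys_heavy > n / 2:
        findings.append("system_time_twice_user")

    return findings


# The r, us, sy and id columns of the four example rows.
SAMPLES = [
    {"r": 0, "us": 4, "sy": 14, "id": 82},
    {"r": 0, "us": 3, "sy": 35, "id": 62},
    {"r": 0, "us": 3, "sy": 33, "id": 64},
    {"r": 0, "us": 1, "sy": 21, "id": 78},
]

print(cpu_findings(SAMPLES, ncpu=1))  # ['system_time_twice_user']
```

　　In the example rows the run queue is empty and the CPU is mostly idle, so only the system-to-user time ratio trips a rule.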

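　　B. Memory Issues:

　　A memory bottleneck is identified by the scan rate (sr). The scan rate is the number of pages scanned by the clock algorithm per second. If the scan rate (sr) stays above 200 pages per second, the system is short of memory.

　　Solution:

　　1. Tune the application and the server so that they make better use of memory and cache.

　　2. Add memory to the system.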
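
　　The scan rate rule can be sketched the same way in Python. The 200 pages per second limit comes from the text; the requirement of three consecutive samples is a hypothetical reading of "stays above", and the function name and the non-zero sample values in the usage lines are made up for illustration.

```python
SCAN_RATE_LIMIT = 200  # pages per second, from the text


def memory_shortage(sr_samples, min_consecutive=3):
    """Return True if sr exceeds the limit for min_consecutive samples in a row.

    min_consecutive is an assumed interpretation of a sustained scan rate.
    """
    run = 0
    for sr in sr_samples:
        run = run + 1 if sr > SCAN_RATE_LIMIT else 0
        if run >= min_consecutive:
            return True
    return False


# The sr column of the example vmstat output: no sign of a shortage.
print(memory_shortage([2, 0, 0, 0]))

# Hypothetical readings: one sustained burst, then isolated spikes.
print(memory_shortage([250, 310, 280, 40]))
print(memory_shortage([250, 10, 300, 20, 400]))
```

　　Requiring a run of consecutive samples keeps a single spike from being reported as a shortage; the right run length depends on the sampling interval.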

　　dmidecode is similar to lsdev on AIX: it shows essentially all of the devices in the system.