A while ago, I faced a performance issue on a server: the server load was high and MySQL was performing badly. After a quick look, I found that the load was a bit high but still manageable, yet MySQL response times were badly affected.
So the next step was to check whether the cause was hardware related, such as a broken RAID or interrupts from a poorly performing device.
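1. Check whether the disk is healthy
hdparm is a Linux utility for viewing and setting disk drive parameters. It can also run simple read-only timing tests: the -T option times reads from the Linux buffer cache without touching the disk, and the -t option times sequential reads from the drive itself with nothing cached beforehand. From the time taken it reports the read throughput, which gives a rough picture of disk performance.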
Timing cached reads: 15078 MB in 2.00 seconds = 7551.32 MB/sec
Timing buffered disk reads: 178 MB in 3.08 seconds = 57.82 MB/sec
This shows that the disk is currently performing poorly: buffered disk reads of about 58 MB/sec suggest the drive is either slow or busy with other work. So I had to look more closely at the activity on the server, using the top command and the load average values.
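The two timing lines above are the kind of output that hdparm's -T (cached reads) and -t (buffered disk reads) options produce. As a minimal sketch, the snippet below pulls the MB/sec figure out of such output so that runs can be compared. The device name /dev/sda, the DEVICE variable and the helper name hdparm_mbps are assumptions made for illustration and are not taken from the original server; hdparm itself needs root.

```shell
#!/bin/sh
# hdparm_mbps KIND: read "hdparm -Tt" output on stdin and print the MB/sec
# figure for KIND ("cached" or "buffered disk"). Hypothetical helper name.
hdparm_mbps() {
    awk -v kind="$1" '
        index($0, "Timing " kind " reads:") {
            # The throughput is the number right before "MB/sec".
            for (i = 2; i <= NF; i++)
                if ($i == "MB/sec") print $(i - 1)
        }'
}

# Only run the real benchmark when a device is given, for example:
#   DEVICE=/dev/sda sh disk_check.sh    (/dev/sda is an assumed device name)
if [ -n "${DEVICE:-}" ]; then
    out=$(hdparm -Tt "$DEVICE")
    echo "cached reads:   $(printf '%s\n' "$out" | hdparm_mbps cached) MB/sec"
    echo "buffered reads: $(printf '%s\n' "$out" | hdparm_mbps 'buffered disk') MB/sec"
fi
```

Running the test a few times on an otherwise idle system gives more meaningful numbers, since other disk activity lowers the buffered read figure.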
1. Observing system statistics using the top command
us --> user cpu time (or) % CPU time spent in user space
sy --> system cpu time (or) % CPU time spent in kernel space
ni --> user nice cpu time (or) % CPU time spent on low priority processes
id --> idle cpu time (or) % CPU time spent idle
wa --> io wait cpu time (or) % CPU time spent waiting on disk I/O. If this value is high, there is something wrong with disk performance.
hi --> hardware irq (or) % CPU time spent servicing/handling hardware interrupts
si --> software irq (or) % CPU time spent servicing/handling software interrupts
st --> steal time (or) % CPU time a virtual CPU spends in involuntary wait while the hypervisor is servicing another processor, that is, % CPU time stolen from a virtual machine
The main values to watch are %us, %sy and %wa.
a. %us and %sy represent the percentage of CPU time spent in user mode and kernel mode, respectively. If %us stays at around 70% or more across all CPUs, the server is likely to become very slow.
b. If %wa becomes high, something is going wrong with disk I/O operations, and overall performance will suffer.
c. A high %hi value means one or more devices are too busy doing their work and are most likely overloaded. In certain cases it might mean a device is broken, so it is good to do a thorough check before it gets worse. Check /proc/interrupts and you can pinpoint the source.
d. A high %si value is not always related to a high frequency of real interrupts. Dividing work between the real interrupt handler and soft IRQs is the driver’s job, so a high %si tends to show that something inside the driver needs optimizing. The easiest way to get that is to upgrade your kernel.
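The rules of thumb above can be turned into a quick check. The sketch below reads a Cpu(s) summary line in the older top format used in this post (for example from top -b -n 1) and prints a warning when %us or %wa crosses a threshold. The function names cpu_field and cpu_warn and the 20% iowait threshold are assumptions made for illustration; the 70% figure for %us is the one mentioned above. Newer top versions print this line as "%Cpu(s):  0.0 us, ..." and would need a different pattern.

```shell
#!/bin/sh
# cpu_field NAME: read a top "Cpu(s):" line on stdin and print the
# percentage for NAME (us, sy, ni, id, wa, hi, si or st).
cpu_field() {
    sed -n "s/.*[ ,]\([0-9.][0-9.]*\)%$1.*/\1/p"
}

# cpu_warn: read a "Cpu(s):" line on stdin and report high %us or %wa.
# Thresholds: 70 for %us (from the text), 20 for %wa (an assumed value).
cpu_warn() {
    line=$(cat)
    us=$(printf '%s\n' "$line" | cpu_field us)
    wa=$(printf '%s\n' "$line" | cpu_field wa)
    if [ -z "$us" ] || [ -z "$wa" ]; then
        echo "unrecognized Cpu(s) line" >&2
        return 1
    fi
    # awk does the floating point comparison; sh only handles integers.
    awk -v us="$us" -v wa="$wa" 'BEGIN {
        if (us + 0 >= 70) print "high user CPU: " us "%"
        if (wa + 0 >= 20) print "high iowait: " wa "%, check disk I/O"
        if (us + 0 < 70 && wa + 0 < 20) print "ok"
    }'
}

# Example use on a live system:
#   top -b -n 1 | grep 'Cpu(s)' | cpu_warn
```

A single snapshot can be misleading, so it is worth running the check several times before drawing conclusions.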
2. How to watch the CPU utilization of each core
Execute top and press “1”. Now you can see each CPU’s statistics.
3. How to sort the processes by CPU usage
Execute top and hit “Shift + P”. Now the processes are listed in order of CPU usage.
4. How to list the processes by memory consumption
Execute top and hit “Shift + M”. Now the processes are listed in order of memory usage.
5. View all the processes run by a user
Press “u” in the terminal while top is running. You should get the “Which user (blank for all):” prompt, as shown below.
Tasks: 434 total, 1 running, 433 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 49409536k total, 33289556k used, 16119980k free, 11032k buffers
Swap: 16777208k total, 130732k used, 16646476k free, 85724k cached
Which user (blank for all):mysql
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15867 mysql 20 0 30.0g 22g 6156 S 0.3 47.3 2718:30 mysqld
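The interactive keys above have non-interactive counterparts that are handy in scripts. As a sketch, assuming the procps version of ps (the --sort option is not available in every ps implementation), the function below lists the busiest processes by CPU or memory, optionally for a single user. The function name top_procs and its argument order are hypothetical.

```shell
#!/bin/sh
# top_procs KEY [COUNT] [USER]: list the COUNT busiest processes.
#   KEY   cpu (like Shift + P in top) or mem (like Shift + M)
#   COUNT number of processes to show, default 10
#   USER  restrict to one user (like pressing "u" in top), default all users
top_procs() {
    case "$1" in
        cpu) sort_key='-%cpu' ;;
        mem) sort_key='-%mem' ;;
        *)   echo "usage: top_procs cpu|mem [count] [user]" >&2; return 2 ;;
    esac
    count=${2:-10}
    if [ -n "${3:-}" ]; then
        ps -u "$3" -o pid,user,%cpu,%mem,rss,comm --sort="$sort_key"
    else
        ps -e -o pid,user,%cpu,%mem,rss,comm --sort="$sort_key"
    fi | head -n "$((count + 1))"   # +1 keeps the header line
}

# Examples:
#   top_procs cpu            # ten busiest processes by CPU
#   top_procs mem 5 mysql    # five largest processes owned by mysql
```

top can also produce a one-off listing for a user in batch mode, for example top -b -n 1 -u mysql.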