1. Application I/O Behaviour & Performance Analysis – What should we expect from I/O Analytics ?

Be it a single compute node, a multi-node compute cluster, or even a full-blown data center with multiple Clusters & Grids, every one of them will have 2 things for sure :

1. Applications
2. Storage for the Applications
Now, every one of us who is associated with any of the above would surely have come across the below questions at least a couple of times, if not hundreds or thousands of times.

  • Application is running slow ? damn why ? no idea ?
  • Is the problem because of badly written Application Code ?
  • Is it the storage that is bottle-necked and making the application crawl ?
  • Is the storage not able to handle the Application’s I/O requests ?
  • Is it network congestion that is making the Network Storage Server serve I/Os much more slowly ?
  • Are there enough disk spindles in the storage to serve the IOPS needed by the Application ? Or shall we invest more in spindles ?
  • Now-a-days, there’s a new kid on the block, the SSD. Shall we replace our complete backend storage layer with All-SSD solutions ? Or Hybrid solutions ?
  • Should we move the Application to the cloud ?

Numerous other such questions come to our mind, and we have no clue about the correct answer.
Those who are on the Application side will definitely refer the issue to the Storage folks.
As Storage folks, we would then start investigating how our storage servers/devices are behaving under the Application workload.

Our Storage Industry is filled with a lot of buzzwords like Application I/O Analytics, Data Analytics, Storage Analytics, Performance Analysis, File Analytics, I/O Analytics, Storage monitoring and a lot more. And a lot of tools promise that they can do one or more of the above. But is it true ? Well, we will find out the truth about a couple of them.

The aim of any Storage Analytics data is to analyze it, arrive at a conclusion about the behaviour and performance of the Storage/Application, and make recommendations for improving the performance. Let's see how the popular tools available come to our rescue. The first tool we will use is iostat. Heard of it, right ?

iostat is a very popular tool that ships with the sysstat package in Linux.
It is used to monitor the Disk I/O performance in terms of Blocks read or written.
When executed without any arguments, the command displays the blocks read & written since the system booted.

Note: iostat displays the CPU stats along with the I/O stats. But we will focus on the I/O stats only.

Performance of Local or Block Devices

There are various modes of output metrics that iostat provides

I/O metrics in terms of 512-byte blocks

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              20.06       650.13        75.82   12480434    1455464
sdc               5.04       116.79      2196.00    2242100   42156560
sdd               4.96       115.81      2193.09    2223204   42100552
sdb               0.06         0.51         0.00       9842          8
sde               0.13         1.07         0.00      20528          0
md127           549.42       231.92      4389.09    4452064   84257112
tps         :  Transfers per second issued to the device or I/O requests issued to the 
               device per second.
Blk_read/s  :  Number of 512-byte blocks read per second
Blk_wrtn/s  :  Number of 512-byte blocks written per second
Blk_read    :  The total number of blocks read
Blk_wrtn    :  The total number of blocks written
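
To relate these block counts to the kilobyte and megabyte reports that follow, here is a minimal Python sketch of the unit conversion. It assumes, as the heading above states, that one block is 512 bytes and that iostat's kB and MB are 1024-based; the helper names are made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: one iostat "block" is a 512-byte sector


def blocks_to_kb(blocks: float) -> float:
    """Convert a count (or per-second rate) of 512-byte blocks to kB (1 kB = 1024 bytes)."""
    return blocks * SECTOR_BYTES / 1024


def blocks_to_mb(blocks: float) -> float:
    """Convert a count (or per-second rate) of 512-byte blocks to MB (1 MB = 1024 kB)."""
    return blocks * SECTOR_BYTES / (1024 * 1024)


# md127 in the report above has written 84257112 blocks since boot.
md127_blk_wrtn = 84257112
print(f"md127 written: {blocks_to_kb(md127_blk_wrtn):.0f} kB")
print(f"md127 written: {blocks_to_mb(md127_blk_wrtn):.2f} MB")
```

In other words, two blocks make one kilobyte, so a Blk_* value is simply twice the corresponding kB_* value taken at the same moment.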

I/O metrics in terms of Kilobytes (iostat -k)

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              19.64       318.14        37.13    6242073     728476
sdc               4.94        57.14      1074.30    1121050   21078280
sdd               4.86        56.66      1072.88    1111602   21050276
sdb               0.06         0.25         0.00       4921          4
sde               0.13         0.52         0.00      10264          0
md127           537.56       113.46      2147.18    2226032   42128556
kB_read/s  :  Number of Kilobytes of data read per second
kB_wrtn/s  :  Number of Kilobytes of data written per second
kB_read    :  The total number of Kilobytes read
kB_wrtn    :  The total number of Kilobytes written

I/O metrics in terms of Megabytes (iostat -m)

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda              19.64         0.31         0.04       6095        711
sdc               4.93         0.06         1.05       1094      20584
sdd               4.86         0.06         1.05       1085      20556
sdb               0.06         0.00         0.00          4          0
sde               0.12         0.00         0.00         10          0
md127           537.45         0.11         2.10       2173      41141
MB_read/s :  Number of Megabytes of data read per second
MB_wrtn/s :  Number of Megabytes of data written per second
MB_read   :  The total number of Megabytes read
MB_wrtn   :  The total number of Megabytes written

All the above metrics are very basic in nature and would only provide us with the answers to the basic questions like

  • Is the workload read intensive or write intensive, i.e. what is the percentage of reads & writes
  • Which disk is heavily loaded & which one is least loaded
  • How heavy is the I/O workload, continuous or bursts
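
As a rough illustration of how the first two questions can be answered from the cumulative counters, the Python sketch below computes the read share per device and picks the device that moved the most blocks. The figures are copied from the 512-byte block report above; the 50% threshold and the function name are assumptions chosen only for this example.

```python
# (device, Blk_read, Blk_wrtn) copied from the 512-byte block report above
totals = [
    ("sda", 12480434, 1455464),
    ("sdc", 2242100, 42156560),
    ("sdd", 2223204, 42100552),
    ("sdb", 9842, 8),
    ("sde", 20528, 0),
    ("md127", 4452064, 84257112),
]


def read_share(blk_read: int, blk_wrtn: int) -> float:
    """Percentage of all transferred blocks that were reads."""
    total = blk_read + blk_wrtn
    return 100.0 * blk_read / total if total else 0.0


for dev, blk_read, blk_wrtn in totals:
    pct = read_share(blk_read, blk_wrtn)
    # Illustrative threshold: more than half of the blocks were reads
    kind = "read intensive" if pct > 50 else "write intensive"
    print(f"{dev:6s} reads {pct:6.2f}%  writes {100 - pct:6.2f}%  -> {kind}")

# Device with the largest total number of blocks transferred since boot
busiest = max(totals, key=lambda row: row[1] + row[2])
least = min(totals, key=lambda row: row[1] + row[2])
print("most loaded :", busiest[0])
print("least loaded:", least[0])
```

Note that md127 is a software RAID device here, so its totals overlap with those of its member disks; the ranking is only meaningful when comparing devices at the same layer.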

But, to analyze the real performance problem, we have to drill down to the next level of metrics. For that we need to run iostat with the -x (extended statistics) option, usually together with an “interval” argument.
This displays a few more advanced I/O metrics which can help us deduce the behaviour of the I/O & storage

iostat -x 2

Here, it will display the metrics every 2 seconds. The first report covers the time since boot; every later report is the difference between the metrics at the current time and those 2 seconds earlier, expressed per second.
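
The interval idea can be sketched in a few lines of Python: take two snapshots of cumulative counters and divide the difference by the elapsed time. This is only an illustration of the arithmetic, not iostat's actual implementation; the counter names and the sample numbers are hypothetical.

```python
def interval_rates(prev: dict, curr: dict, interval_s: float) -> dict:
    """Per-second rates from two cumulative counter snapshots taken interval_s apart."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    rates = {}
    for dev, now in curr.items():
        before = prev.get(dev)
        if before is None:
            continue  # device appeared between the two samples, no delta available
        rates[dev] = {name: (now[name] - before[name]) / interval_s for name in now}
    return rates


# Hypothetical cumulative counters (512-byte blocks), sampled 2 seconds apart
t0 = {"sda": {"blk_read": 12480434, "blk_wrtn": 1455464}}
t1 = {"sda": {"blk_read": 12481734, "blk_wrtn": 1455616}}

for dev, r in interval_rates(t0, t1, 2).items():
    print(f"{dev}: {r['blk_read']:.2f} blocks read/s, {r['blk_wrtn']:.2f} blocks written/s")
```

Because each report only reflects the last interval, short bursts that would be averaged away in the since-boot figures become visible.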

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.61     9.28   21.02    1.20   720.37    83.89    36.20     0.31   14.12   2.93   6.52
sdc               0.13   298.99    0.42    5.17   129.41  2433.28   458.45     3.40  607.49   5.03   2.81
sdd               0.11   298.56    0.30    5.20   128.32  2430.05   465.05     1.65  299.37   4.54   2.50
sdb               0.01     0.00    0.06    0.00     0.57     0.00     9.10     0.00    0.98   0.98   0.01
sde               0.01     0.00    0.14    0.00     1.18     0.00     8.37     0.00    0.50   0.50   0.01
md127             0.00     0.00    0.86  607.92   256.97  4863.33     8.41     0.00    0.00   0.00   0.00
rrqm/s    : Number of read requests merged per second that were queued to the device. A large rrqm/s value signifies Sequential Read
wrqm/s    : Number of write requests merged per second that were queued to the device. A large wrqm/s value signifies Sequential Write
r/s       : Number of read requests that were issued to the device per second. A larger r/s value signifies a Read 
            Intensive I/O
w/s       : Number of write requests that were issued to the device per second. A larger w/s value signifies a Write 
            Intensive I/O
rsec/s    : Number of sectors read from the device per second
wsec/s    : Number of sectors written to the device per second
avgrq-sz  : Average size of the requests (in number of sectors) that were issued to the device. A larger request size 
            signifies a Sequential I/O
avgqu-sz  : Average queue length of the requests issued to the device. If the queue is large, then it means Heavy I/O is 
            in progress. But, at the same time, if %util is low then we can assume that it's Burst I/O
await     : Average time in milliseconds for I/O requests to be served (Time spent in queue + Time taken to service them)
svctm     : Obsolete field - Don't use this anymore
%util     : Percentage of elapsed time during which I/O requests were issued to the device. If this is close to 100% for a 
            good amount of time, then that's a clear indication of a saturated/bottle-necked disk
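
Several of these columns are related to each other, and checking the relationships is a good way to understand them. The Python sketch below parses three rows of the report above and recomputes two values: the average request size as total sectors divided by total requests, and an estimate of the queue length as request rate times await (Little's law). This is a reading aid under the assumption of 512-byte sectors, not a description of how iostat computes its columns internally; small differences come from rounding in the printed report.

```python
REPORT = """\
Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.61     9.28   21.02    1.20   720.37    83.89    36.20     0.31   14.12   2.93   6.52
sdc               0.13   298.99    0.42    5.17   129.41  2433.28   458.45     3.40  607.49   5.03   2.81
sdd               0.11   298.56    0.30    5.20   128.32  2430.05   465.05     1.65  299.37   4.54   2.50
"""


def parse_report(text: str) -> dict:
    """Turn an 'iostat -x' style table into {device: {column: value}}."""
    lines = text.strip().splitlines()
    columns = lines[0].split()[1:]  # drop the leading "Device:" label
    rows = {}
    for line in lines[1:]:
        name, *values = line.split()
        rows[name] = dict(zip(columns, (float(v) for v in values)))
    return rows


def derive(m: dict) -> dict:
    """Recompute request size and queue length from the basic rates."""
    iops = m["r/s"] + m["w/s"]
    sectors = m["rsec/s"] + m["wsec/s"]
    avg_sectors = sectors / iops if iops else 0.0
    return {
        "iops": iops,
        "avg_req_sectors": avg_sectors,              # compare with avgrq-sz
        "avg_req_kb": avg_sectors * 512 / 1024,      # assumes 512-byte sectors
        "queue_estimate": iops * m["await"] / 1000,  # compare with avgqu-sz
    }


rows = parse_report(REPORT)
for dev, m in rows.items():
    d = derive(m)
    print(f"{dev}: {d['iops']:.2f} IOPS, "
          f"avg request {d['avg_req_sectors']:.2f} sectors (reported {m['avgrq-sz']}), "
          f"queue ~{d['queue_estimate']:.2f} (reported {m['avgqu-sz']})")
```

For sdc and sdd, the large request size together with the high wrqm/s points to sequential writes, while the long await with a low %util matches the burst pattern described for avgqu-sz.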

Analytics for a Network Storage / Filesystem

iostat has minimal support for NFS or Network shares as well.

Filesystem:          rBlk_nor/s   wBlk_nor/s   rBlk_dir/s   wBlk_dir/s   rBlk_svr/s   wBlk_svr/s     ops/s    rops/s    wops/s
                      120832.00         0.00         0.00         0.00    119808.00         0.00     60.00     58.50      0.00

Filesystem:           rMB_nor/s    wMB_nor/s    rMB_dir/s    wMB_dir/s    rMB_svr/s    wMB_svr/s     ops/s    rops/s    wops/s
                          84.50         0.00         0.00         0.00        85.00         0.00     77.50     85.00      0.00
rBlk_nor/s  : Number of 512 byte blocks read per second by Application using read system call
wBlk_nor/s  : Number of 512 byte blocks written per second by Application using write system call
rBlk_dir/s  : Number of 512 byte blocks read per second by Application using direct io read
wBlk_dir/s  : Number of 512 byte blocks written per second by Application using direct io write
rBlk_svr/s  : Number of blocks read per second from the NFS server using NFS Reads
wBlk_svr/s  : Number of blocks written per second to the NFS server using NFS Writes
ops/s       : Number of operations issued to the Filesystem per second
rops/s      : Number of read operations issued to the Filesystem per second
wops/s      : Number of write operations issued to the Filesystem per second

The above output can also be seen in terms of Kilobytes or Megabytes instead of 512 byte blocks, as shown in the 2nd output, where the metric rMB_nor/s corresponds to rBlk_nor/s and so on.
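
As a small worked example on the first NFS report above, the Python sketch below converts the block rates to MB/s and estimates the average size of a read operation. Dividing the server read bytes by rops/s is an assumption made here for illustration (it treats every read operation as an NFS read to the server); the function name is hypothetical.

```python
BLOCK_BYTES = 512
MIB = 1024 * 1024


def nfs_read_summary(rblk_nor: float, rblk_svr: float, rops: float) -> dict:
    """Summarise the read side of one NFS report line (rates are per second)."""
    server_bytes = rblk_svr * BLOCK_BYTES
    return {
        "app_read_mb_s": rblk_nor * BLOCK_BYTES / MIB,  # read via the read system call
        "server_read_mb_s": server_bytes / MIB,         # actually fetched from the NFS server
        # Assumption: every counted read op went to the server
        "avg_read_op_bytes": server_bytes / rops if rops else 0.0,
    }


# Values from the first NFS report above
summary = nfs_read_summary(rblk_nor=120832.00, rblk_svr=119808.00, rops=58.50)
print(f"application reads : {summary['app_read_mb_s']:.2f} MB/s")
print(f"server reads      : {summary['server_read_mb_s']:.2f} MB/s")
print(f"average read op   : {summary['avg_read_op_bytes']:.0f} bytes")
```

Under that assumption, the average read works out to 1 MB per operation, which suggests large sequential reads on this mount.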

How helpful is iostat ? Is the block level analytical data displayed above enough for our storage thirsty souls ?

Now, these metrics provide us answers to the questions like :

  • Is the I/O sequential or random
  • Are the disks able to handle the I/O load effectively or get saturated at a point
  • Read intensive or Write intensive I/O
  • Latency of each I/O request. Are the I/Os being served at an acceptable rate/throughput, or is each I/O request taking too much time, thereby resulting in bad throughput
  • Which disks are utilized heavily and which are not at all ?
  • Do we need striping by adding more spindles ?

That’s it ? Any analytical data above the disk layer ? Like files, directories, IOPS ?? Well sadly no…

Are the metrics provided by iostat enough to Analyze Application & Storage behaviour in a complex environment ?

Now, many Storage Analysts would be satisfied with the above analysis. But a good number of them would consider the above Analytics information the bare minimum and look for more data points like the ones below. The Analytical data that we saw above only tells us about the Application’s I/O in terms of Blocks read and written and the latency of each I/O request. It has no clue about the answers to our even deeper questions. Does it ? No way….

Can we somehow find these ?

  1. Which files were accessed during the I/O Workload ? We need the details, i.e. the exact file path
  2. What were the I/O operations done on them ? read, write or both
  3. What filesystem operations were done ? Were there too many open, close, link, symlink, unlink etc ?
  4. How many files were accessed during the I/O ? One large file or thousands of small files ?
  5. Which files were re-read or re-written multiple times ?
  6. Which files were written and never read again ?
  7. Which files were only read ?
  8. Which areas of each file were read/written to ? Or was the whole file accessed ? What are the hot areas of the file ?
  9. Does the I/O involve a lot of Attribute operations or only plain read and write ?
  10. Throughput, IOPS, Latency at different levels like starting from the Mount itself until each file level.
  11. For any file, we should be able to see its individual IOPS, Throughput, Latency for Read & Write
  12. Which processes were doing what kind of I/O and how much ? For example, some processes reading and some others writing to the file ?
  13. We need to know how exactly one or more processes are accessing a file
  14. What about Process Groups ? Which process group(s) are accessing what files and their I/O analytics ?
  15. An even larger grouping i.e. Sessions in the System – Which Sessions are acting on which files and what are their Analytics ? For example, which files is the Gnome Desktop Environment Session acting on and what operations are done on them ?
  16. Resident data set size of an Application Workload ? Does the application access data in MBs or GBs or TBs ?
  17. Does it access a large amount of data once only or a small amount of data but thousands of times ?
  18. The Process tree hierarchy of the Processes doing I/O. It should be clear as to which process exactly is doing what I/O

What about Applications running on multiple nodes and accessing the same set of data ?

  1. How do we get consolidated Analytics for all the Nodes ?
  2. If multiple compute nodes have mounted the same NFS share and are accessing one or more common files, then can we get the total I/O metrics on those files ? And individually from each compute node as well ?
  3. How about I/O Analytics based on Autofs Environment ? We need all I/O metrics for all mounts under an Autofs
  4. Which files they access, what I/O operation is done, Throughput, IOPS, Latency etc.
  5. Now, what if we want to know how the backend NFS servers are behaving ? We need IOPS, Latency, Throughput, Files accessed from them, their metrics and everything related to each Storage Box.

Now, let's talk about some state-of-the-art Application & Storage Analytics requirements

In a group of one or more compute nodes running multiple applications, accessing files from Multiple NFS Filers, can we find out the :

  1. The effective IOPS, Latency, Throughput, Read/Write done by each process
  2. The files accessed during I/O by each process – again in each node and aggregated in all nodes
  3. The IOPS, Latency, Throughput, Read/Write done for each file
  4. The effective IOPS, Latency, Throughput, Read/Write achieved by each compute node
  5. The effective IOPS, Latency, Throughput, Read/Write of each backend Storage Server / Filer
  6. Imagine how good it would be if all the above Analytics data were available to us in 2 formats i.e. for each node as well as aggregated across all nodes.
  7. Can we get Analytics data for a selected group of Applications or a selected group of Backend Filers ? For example, we can select 5-10 applications and see how they are behaving, the effective IOPS, Throughput, Latency they are getting, the Read/Write done by them, files being accessed by them and all other such information which will help us fine tune them further

Let me stop here, as I can see it's too much to ask of any Utility or Application/Storage Analytical Tool.

Now, do any of you know of any tool/utility/software by which we can get the answers to the questions above ? If not all, any tool/utility which can provide answers to at least a few of the above questions ? Let me know about it in the comments section.

Well, we have one such tool which can answer all of the above questions….yes …all of the above and even more…Want to know which one ? Stay tuned, will reveal in the next blog very soon.
