Causes for inefficient I/O?

I'm rsyncing a huge folder from an external to an internal 3.5" HDD, both 5,400 rpm. When watching the current throughput with dstat, I regularly see patterns like this:



--total-cpu-usage-- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai stl| read writ| recv send| in out | int csw
20 6 34 40 0| 98M 90M|1272B 3652B| 0 0 |1702 4455
21 6 37 37 0| 121M 90M|1646B 4678B| 0 0 |2057 6488
17 24 29 30 0| 77M 95M| 630B 2416B| 0 0 |1581 4644
20 5 33 43 0| 86M 84M|1372B 2980B| 0 0 |1560 4146
20 6 30 44 0| 80M 75M| 700B 2722B| 0 0 |1484 3942
11 2 47 39 0| 39M 65M| 642B 1332B| 0 0 | 765 1507
0 0 50 50 0| 0 91M| 70B 354B| 0 0 | 136 70
0 0 50 49 0| 0 71M| 306B 346B| 0 0 | 146 119
0 0 50 50 0| 0 83M| 70B 346B| 0 0 | 145 60
0 0 50 50 0| 0 0 | 70B 346B| 0 0 | 36 84
0 0 50 50 0| 0 0 | 164B 646B| 0 0 | 35 71
0 0 50 50 0| 0 0 | 140B 802B| 0 0 | 30 64
0 0 50 50 0| 0 0 | 70B 346B| 0 0 | 27 68
0 0 50 50 0| 0 34M| 134B 346B| 0 0 | 86 68
0 0 50 50 0| 0 0 | 70B 346B| 0 0 | 30 71
0 0 50 50 0| 0 0 |2320B 346B| 0 0 | 40 76
0 0 50 50 0| 0 0 | 70B 346B| 0 0 | 29 71
0 0 50 50 0| 0 0 | 70B 346B| 0 0 | 25 50
-----------------------------[snip]------------------------------
0 0 50 50 0| 0 0 |2230B 346B| 0 0 | 35 61
0 0 50 50 0| 0 60M| 70B 346B| 0 0 | 118 83
1 7 42 50 0| 256k 104M| 230B 500B| 0 0 | 281 480
21 5 31 42 0| 117M 76M|1120B 3392B| 0 0 |1849 4309
23 5 36 36 0| 137M 56M|1202B 3958B| 0 0 |2149 5782
24 5 36 35 0| 138M 100M|1284B 4112B| 0 0 |2174 6021


For several seconds, sometimes up to a minute, both read and write throughput drop to zero. What's the bottleneck here?



Since both drives are about the same speed, neither of them should be idle for long. Moreover, at least one drive should always be reading or writing. What is the system waiting for?



The system is otherwise idle; the only thing eating CPU is the rsync task. Memory is 8 GB, the CPU is a 7th-gen quad-core i5. The internal HDD is connected via SATA to the mainboard, a Gigabyte G170X-Ultra Gaming. The filesystem is ext4 in both cases, encrypted with dm-crypt/LUKS on the internal (write) side. Might that be the cause? If so, how can I check the performance of dm-crypt? I can see that the CPU is 50% idle and 50% waiting when the transfer drops occur. What can I conclude from that?
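
A quick way to check dm-crypt performance in isolation, independent of rsync (a sketch; the partition and mapper names below are placeholders):

$ cryptsetup benchmark                    # raw cipher speed in memory, no disk I/O involved
$ sudo hdparm -t /dev/sdX2                # buffered reads below the crypto layer
$ sudo hdparm -t /dev/mapper/internal     # buffered reads through the dm-crypt mapping

On a 7th-gen i5 with AES-NI the cipher itself is usually far faster than a 5,400 rpm disk, so if the two hdparm figures are close, dm-crypt is an unlikely bottleneck.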



It's an up-to-date Arch Linux with kernel version 4.13.11-1-ARCH. Anything to look out for? Thanks in advance.



UPDATE: iotop was pointed out as being more accurate than dstat. Unfortunately, iotop shows zero throughput as well when dstat drops to zero. I've made a screencast to show it.







asked Nov 4 '17 at 20:12 by LukeLR, last edited Nov 4 '17 at 22:07

  • I think your external hdd is not in the best condition.
    – Ipor Sircer
    Nov 4 '17 at 21:39










  • You can try to grab a stack trace of your rsync process during an IO pause to get some data on what it's trying to do at that time. FWIW, rsync is an extremely inefficient way to transfer data - it has to do a lot of IO operations just to identify what needs to be synced before even starting to copy data. Something like ZFS send/receive of incremental snapshots is much more efficient because the filesystem keeps track of changes as they are made - there's no crawling two entire filesystems trying to find what needs to be sent.
    – Andrew Henle
    Nov 5 '17 at 14:36
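
A minimal sketch of how to grab such a trace during a stall, assuming the process is simply named rsync (these are generic /proc and procps tools, not rsync-specific):

# kernel-side stack of each rsync process: shows which function it is blocked in
$ for pid in $(pidof rsync); do echo "== $pid =="; sudo cat /proc/$pid/stack; done

# process state (D = uninterruptible I/O wait) and wait channel, refreshed every second
$ watch -n1 'ps -o pid,state,wchan:32,cmd -C rsync'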














2 Answers


Accepted answer (4 votes)

There are two sets of tools for getting some block-level device statistics. The first is iolatency, from Brendan Gregg's perf-tools. It produces a simple histogram of disk operation latency, such as:



>=(ms) .. <(ms) : I/O |Distribution |
0 -> 1 : 1913 |######################################|
1 -> 2 : 438 |######### |
2 -> 4 : 100 |## |
4 -> 8 : 145 |### |
8 -> 16 : 43 |# |
16 -> 32 : 43 |# |
32 -> 64 : 1 |# |


Another script in the toolset is iosnoop, which shows commands and their operations, e.g.:



COMM PID TYPE DEV BLOCK BYTES LATms
/usr/bin/mon 31456 R 8,0 9741888 4096 2.14
/usr/bin/mon 31456 R 8,0 9751408 4096 0.16
/usr/bin/mon 31456 R 8,0 20022728 4096 1.44
/usr/bin/mon 31456 R 8,0 19851752 4096 0.26
jbd2/sda3-41 416 WS 8,0 130618232 65536 1.89
jbd2/sda3-41 416 WS 8,0 209996928 65536 1.92
jbd2/sda3-41 416 WS 8,0 210006528 8192 1.94


Then there is the blktrace package, which records low-level block operations with blktrace and then shows all sorts of information with blkparse and several other commands, including the simple summary from btt (see its PDF user guide):



$ sudo blktrace /dev/sda # ^C to stop
=== sda ===
CPU 0: 180 events, 9 KiB data
CPU 1: 1958 events, 92 KiB data
Total: 2138 events (dropped 0), 101 KiB data
$ ls -ltra # one file per cpu
-rw-r--r-- 1 root root 8640 Nov 5 10:16 sda.blktrace.0
-rw-r--r-- 1 root root 93992 Nov 5 10:16 sda.blktrace.1
$ blkparse -O -d combined.output sda.blktrace.* # combine cpus
$ btt -i combined.output
ALL MIN AVG MAX N
Q2Q 0.000001053 0.106888548 6.376503027 253
Q2G 0.000000795 0.000002266 0.000011060 184
G2I 0.000000874 0.000979485 0.002588781 328
Q2M 0.000000331 0.000000599 0.000002716 70
I2D 0.000000393 0.000480112 0.002435491 328
M2D 0.000002044 0.000028418 0.000126845 70
D2C 0.000080986 0.001925224 0.010111418 254
Q2C 0.000087025 0.002603157 0.010120629 254
...


D2C, for example, is how long it takes the hardware device to do an operation.
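
blktrace can also stream straight into blkparse, which makes it easy to watch a device live while a stall is happening and to compare the raw disk with the dm-crypt mapper (a sketch; /dev/sdb and /dev/dm-0 are placeholders for the actual devices):

$ sudo blktrace -d /dev/sdb -o - | blkparse -i -     # live trace of the source disk
$ sudo blktrace -d /dev/dm-0 -o - | blkparse -i -    # same for the dm-crypt device, in another terminal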



You might also run sudo smartctl -a /dev/sda on each disk to see whether any failures are showing.






answered Nov 5 '17 at 9:20 by meuh, last edited Nov 5 '17 at 13:43


Answer (1 vote)

I guess that dstat uses I/O statistics at the file-descriptor level, i.e. the application calls write(), and as soon as the syscall returns, the amount of data that dstat sees increases.



    But that does not mean that the data has actually been written. I guess that these seeming pauses are the phases in which the buffers are written to the block device. It makes sense that during these times the I/O wait value is even higher than during the phases in which dstat measures data transfer.



iotop distinguishes between writes and reads that hit the disk and those that only touch the cache. Maybe that tool can deliver interesting additional information.






answered Nov 4 '17 at 20:28 by Hauke Laging

    • Thanks for your reply! Unfortunately, iotop shows the same drops as dstat, see above...
      – LukeLR
      Nov 4 '17 at 22:08










    • @LukeLR You misunderstood me. When I/O blocks then neither dstat nor iotop should see a change. I meant: Before the pauses iotop should show that a lot of the writing is done to the cache.
      – Hauke Laging
      Nov 4 '17 at 22:19










    • Okay, thanks. But is there a way to confirm if the disk is still active during these zero-periods, or if it is waiting for some other bottleneck?
      – LukeLR
      Nov 4 '17 at 23:33











    • @LukeLR Unfortunately I do not know that. Maybe with the tools from the other answer. But another thought: If the buffers are being flushed during a pause then you would expect the size of the buffers (e.g. in top) to change.
      – Hauke Laging
      Nov 5 '17 at 9:46
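
Regarding the open points in these comments, whether the disks are still busy during the zero-throughput periods and whether the page cache is being flushed, both can be watched with standard tools (a sketch; sda and sdb stand in for the internal and external drive):

# per-device utilisation and queue size once per second; %util near 100 with little
# throughput means the drive is busy (e.g. seeking), not idle
$ iostat -dxm 1 sda sdb

# dirty data waiting for writeback; numbers shrinking during a pause mean the cache
# is being flushed to disk
$ watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'

iostat is part of the sysstat package.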









