Backing up to a local SSD taking a lot longer than expected
On a fresh Ubuntu 18.04 system I decided to give Deja-Dup a go. It's just a GUI for Duplicity (which uses the rsync algorithm, via librsync). About 850GB of data needed to be backed up. The source SSD was NVMe; the destination SSD was SATA. The initial (full) backup took about 7 hours.
The next day I did nothing more than check mail and install an application (which added ~230MB) before running Deja-Dup again.
Deja-Dup ran for ~39 minutes, loading a single core at 35–70% for the entire duration.
Duplicity was invoked three times:
- The Scanning phase pegged a core at 100% for 18 minutes.
- The Backing Up phase pegged a core at 100% for 16 minutes.
- The Verifying phase pegged a core at 100% for 5 minutes.
This is on a new computer with plenty of RAM. The backup was not encrypted.
Now, I expect initial (full) backups to take a while, and that's fine. What I don't expect is for ~230MB of new data to take ~39 minutes to be backed up (and consume over a collective core-hour of CPU time).
Is something wrong or broken? Should incremental backups of a couple of hundred megabytes take that much time to perform? I was expecting something under 5 minutes, not ~39. Why is it taking so long?
(If I had a spare 1TB SATA SSD I'd hook it up and just rsync the data straight across to see if that is noticeably faster, but unfortunately I do not.)
Update 1: I ran a manual backup after negligible changes (a few KB of new mail) and the time taken was the same (~39 minutes). Thus the time taken seems to have little to do with the amount of new data that needs to be backed up.
Update 2: Monitoring with iotop revealed that the Scanning phase reads 7.36GB from the drive. Now that's obviously not the whole 850GB... but it's not far removed from the number you get if you multiply the number of files on the source drive (1,174,000) by the block/cluster size (4096), i.e. 4.81GB. Not sure how to account for the remaining 2.5GB though, if this were the case.
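For reference, that estimate can be reproduced like this (assuming the backup source is /home; the path is a guess, adjust it to whatever tree is being backed up):
$ find /home -xdev -type f | wc -l   # ~1,174,000 files on the source drive
$ echo $((1174000 * 4096))           # 4808704000 bytes, i.e. the ~4.81GB figure above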
Tags: rsync, backup, ssd, duplicity, dejadup
asked Apr 29 at 8:10, edited Apr 30 at 3:14 – Tim
Perhaps it made a 2nd full backup? I've never used the GUI. Try looking in the backup directory to see what new files it created. I think you should see files `duplicity-inc.*` for the incremental backup.
– meuh, Apr 29 at 8:33
Yep, I've got the `duplicity-inc.*` files for today and they seem to be about the right size (~15% compression). The rest is dated yesterday. The destination SSD is only 1TB, so it couldn't fit two full backups without overwriting (parts of) the first anyway. So it looks like the programs did the job correctly; I'm just concerned about how long it took them to do it.
– Tim, Apr 29 at 8:49
I'm not sure how duplicity works when doing a comparison. Whereas rsync is typically configured to just compare timestamps and sizes, I think duplicity recomputes a "signature" for each file, which involves reading the whole file and comparing it with the saved signature in the backup. If so, the scan/backup is having to read all 850GB of your disk to collect the signatures.
– meuh, Apr 29 at 9:14
If that's the case, then won't the atime on every file be modified, and, because they're on SSD, won't that then trigger the rewriting of all 850GB in the process? (Or at least the first inode of each file, of which there are ~1.1 million?)
– Tim, Apr 29 at 20:02
Linux has optimised atime; see `man 8 mount` for relatime etc. Also, some backup programs will explicitly call `utimes()` to restore the atime after reading the file. I don't know what duplicity does.
– meuh, Apr 29 at 20:38
1 Answer
I can confirm that the Deja-Dup/Duplicity combination just makes the backup process atrociously slow (~156× slower, in fact: ~39 minutes versus ~15 seconds).
I ended up wiping the Deja-Dup/Duplicity backup and just went with pure rsync. Backups are now taking as little as 15 seconds.
$ rsync -a --delete /home/tim /media/tim/BackupDrive/   # -a preserves metadata; --delete prunes files removed from the source
Unless you really, really, really need the simplicity that Deja-Dup provides or the features that Duplicity provides, don't even bother: you're just wasting CPU cycles, time, and electricity, causing unnecessary wear on your drive, and ending up with a fragile backup. Deja-Dup/Duplicity is a horribly inefficient way to simply back up files from one local drive to another.
For simple, automated, local backups, all you need is to add an rsync entry to your crontab, as sketched below.
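For example, a nightly run at 02:00 might look like this (the schedule, paths, and log file here are assumptions; adapt them to your setup). Add the line via `crontab -e`:
# m h dom mon dow  command
0 2 * * * rsync -a --delete /home/tim /media/tim/BackupDrive/ >> /home/tim/backup.log 2>&1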
answered May 27 at 15:36, edited May 27 at 15:42 – Tim (accepted)