What is the fastest way to send massive amounts of data between two computers? [closed]
107 votes
This is a situation I am frequently in:
- I have a source server with a 320GB hard-drive inside of it, and 16GB of RAM (exact specs available here, but as this is an issue I run into frequently on other machines as well, I would prefer the answer to work on any "reasonable" Linux machine)
- I have a backup server with several terabytes of hard-drive space (exact specs here, see disclaimer above)
I want to transfer 320GB of data from the source server to the target server (specifically, the data from /dev/sda).
- The two computers are physically next to each other, so I can run cables between them.
- I'm on a LAN, and I'm using a new-ish router, which means my network speeds should "ideally" be 1000Mbit, right?
- Security is not an issue. I am on a local network, and I trust all machines on the network, including the router.
(optional) I don't necessarily need a signed checksum of the data, but basic error checking (such as dropped packets, or the drive becoming unreadable) should be detected rather than just disappear into the output.
I searched for this question online, and have tested several commands. The one that appears the most often is this:
ssh user@192.168.1.100 'dd bs=16M if=/dev/sda | gzip' > backup_sda.gz
This command has proven too slow (it ran for an hour, only got about 80GB through the data). It took about 1 minute and 22 seconds for the 1GB test packet, and ended up being twice as fast when not compressed. The results may also have been skewed by the fact that the transferred file is less than the amount of RAM on the source system.
Moreover (and this was tested on 1GB test pieces), I'm getting issues if I use the gzip command and dd; the resulting file has a different checksum when extracted on the target than it does if piped directly. I'm still trying to figure out why this is happening.
networking backup dd file-transfer
asked Sep 7 '15 at 3:44 by IQAndreas; edited Sep 7 '15 at 10:46
closed as primarily opinion-based by slm♦ Sep 28 at 2:28
Many good questions generate some degree of opinion based on expert experience, but answers to this question will tend to be almost entirely based on opinions, rather than facts, references, or specific expertise. If this question can be reworded to fit the rules in the help center, please edit the question.
52 – Don't forget sneakernet. – gwillie, Sep 7 '15 at 4:43

4 – Do you want to transfer /dev/sda as an image or just the files? Why is rsync not an option? Is /dev/sda mounted while you dd it? – Jodka Lemon, Sep 7 '15 at 5:48

14 – Your performance data (1GB/80sec, 80GB/1h) matches perfectly what we should expect on 100Mbit. Check your hardware. ... and gerrit is right, 320GB may be large, but "massive amount of data" raises wrong expectations. – blafasel, Sep 7 '15 at 13:09

7 – "Never underestimate the bandwidth of a freight train full of disks." .. Are you asking about throughput, latency, or some mix of the two? – keshlam, Sep 8 '15 at 21:19

8 – A friend of mine always said: "Never underestimate the bandwidth of a pile of hard drives on a truck". – AMADANON Inc., Sep 10 '15 at 2:04
20 Answers
136 votes
Since the servers are physically next to each other, and you mentioned in the comments you have physical access to them, the fastest way would be to take the hard-drive out of the first computer, place it into the second, and transfer the files over the SATA connection.
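To put a concrete (hypothetical) command on that: once the relocated drive shows up on the backup server, imaging it is a purely local dd. The device name and mount point below are assumptions; check lsblk or dmesg to find the real ones.
# minimal sketch, assuming the source drive appears as /dev/sdb
# and the backup filesystem is mounted at /backup
dd if=/dev/sdb of=/backup/sda.img bs=16M status=progress
status=progress needs a reasonably recent GNU coreutils dd; on older versions, drop it and send SIGUSR1 to the process for progress reports.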
15 – +1: Transferring via physical seems to be the fastest route, even if it means getting a big external hard drive from somewhere. It's about £40, and you've probably spent that much in time already. – deworde, Sep 7 '15 at 10:06

3 – I completely disagree with this idea if one is getting full speed across a gigabit network. Testing over NFS/SMB over a Zyxel Gigabit switch between an HP Gen 7 microserver and a Pentium G630 machine gives me ~100MB/s transfer. (Until I leave the outer edge of the drive platters.) So I think it'd realistically be done in under 3 hours. Unless you are using SSDs or extremely high performance drives/storage, I don't think 2 copies can produce a throughput of 100MB/s; that would require each copy operation to be 200MB/s just to break even. – Phizes, Sep 8 '15 at 1:52

3 – @Phizes: obviously you don't copy to a temporary. That was deworde's bad idea, not what everyone else is talking about. The point of connecting the source drive to the target machine is to go SATA->SATA with dd (or a filesystem tree copy). – Peter Cordes, Sep 9 '15 at 21:12

10 – "Never underestimate the bandwidth of a truck full of hard drives. One hell of a latency though" – Kevin, Sep 11 '15 at 4:31

3 – @Kevin: yes, my point was that a direct copy between disks in the same computer is at least as fast as any other possible method. I brought up real-life bandwidth numbers to acknowledge Phizes's point that going over gigE is fine for the OP's old drive, but a bottleneck for new drives. (One case where both drives in one computer is not the best option is when having separate computers using their RAM to cache the metadata of the source and dest is important, e.g. for rsync of billions of files.) – Peter Cordes, Sep 13 '15 at 3:59
69 votes
netcat is great for situations like this where security is not an issue:
# on destination machine, create listener on port 9999
nc -l 9999 > /path/to/outfile
# on source machine, send to destination:9999
nc destination_host_or_ip 9999 < /dev/sda
# or dd if=/dev/sda | nc destination_host_or_ip 9999
Note: if you are using dd from GNU coreutils, you can send SIGUSR1 to the process and it will emit progress to stderr. For BSD dd, use SIGINFO.
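As a small illustration (assuming a single dd process is running on the source machine):
# GNU dd: prints records in/out and throughput to stderr
pkill -USR1 -x dd
# BSD dd: use SIGINFO instead
# pkill -INFO -x dd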
pv is even more helpful in reporting progress during the copy:
# on destination
nc -l 9999 | pv > /path/to/outfile
# on source
pv /dev/sda | nc destination_host_or_ip 9999
# or dd if=/dev/sda | pv | nc destination_host_or_ip 9999
2 – For the second example, is dd even required, or can pv/nc treat /dev/sda just fine on their own? (I have noticed some commands "throw up" when trying to read special files like that one, or files with 0x00 bytes.) – IQAndreas, Sep 7 '15 at 4:07

5 – @user1794469 Will the compression help? I'm thinking the network is not where the bottleneck is. – IQAndreas, Sep 7 '15 at 6:40

16 – Don't forget that in bash one can use > /dev/tcp/IP/port and < /dev/tcp/IP/port redirections instead of piping to and from netcat respectively. – Incnis Mrsi, Sep 7 '15 at 16:41

5 – Good answer. Gigabit Ethernet is often faster than hard drive speed, so compression is useless. To transfer several files consider tar cv sourcedir | pv | nc dest_host_or_ip 9999 and cd destdir ; nc -l 9999 | pv | tar xv. Many variations are possible; you might e.g. want to keep a .tar.gz at the destination side rather than copies. If you copy directory to directory, for extra safety you can perform an rsync afterwards, e.g. from dest: rsync --inplace -avP user@192.168.1.100:/path/to/source/. /path/to/destination/. ; it will guarantee that all files are indeed exact copies. – Stéphane Gourichon, Sep 8 '15 at 9:08

3 – Instead of using IPv4 you can achieve better throughput by using IPv6 because it has a bigger payload. You don't even have to configure it; if the machines are IPv6 capable they probably already have an IPv6 link-local address. – David Costa, Sep 8 '15 at 13:07
33 votes
Do use fast compression.
- Whatever your transfer medium - especially for network or usb - you'll be working with data bursts for reads, caches, and writes, and these will not exactly be in sync.
- Besides the disk firmware, disk caches, and kernel/ram caches, if you can also employ the systems' CPUs in some way to concentrate the amount of data exchanged per burst then you should do so.
- Any compression algorithm at all will automatically handle sparse runs of input as fast as possible, but there are very few that will handle the rest at network throughputs.
lz4 is your best option here:
LZ4 is a very fast lossless compression algorithm, providing compression speed at 400 MB/s per core, scalable with multi-core CPUs. It also features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.
Preferably do not unnecessarily seek.
- This can be difficult to gauge.
If there is a lot of free space on the device from which you copy, and the device has not been recently zeroed, but all of the source file-system(s) should be copied, then it is probably worth your while to first do something like:
</dev/zero tee >empty empty1 empty2; sync; rm empty*
But that depends on what level you should be reading the source. It is usually desirable to read the device from start to finish from its /dev/some_disk device file, because reading at the file-system level will generally involve seeking back and forth and around the disk non-sequentially. And so your read command should be something like:
</dev/source_device lz4 | ...
However, if your source file-system should not be transferred entire, then reading at the file-system level is fairly unavoidable, and so you should ball up your input contents into a stream. pax is generally the best and most simple solution in that case, but you might also consider mksquashfs as well.
pax -w /source/tree[12] | lz4 | ...
mksquashfs /source/tree[12] /dev/fd/1 -comp lz4 | ...
Do not encrypt with ssh.
- Adding encryption overhead to a trusted medium is unnecessary, and can be severely detrimental to the speed of sustained transfers in that the data read needs reading twice.
- The PRNG needs the read data, or at least some of it, to sustain randomness.
- And of course you need to transfer the data as well.
- You also need to transfer the encryption overhead itself - which means more work for less data transferred per burst.
And so rather you should use netcat (or, as I prefer, the nmap project's more capable ncat) for a simple network copy, as has elsewhere been suggested:
### on tgt machine...
nc -l 9999 > out.lz4
### then on src machine...
... lz4 | nc tgt.local 9999
1 – Fantastic answer. One minor grammatical point -- "lessen the amount of data which needs exchanging per burst" -- I think you're using compression to increase the information density as the 'bursts' are fixed-width and therefore the amount of exchanged data remains constant though the information transferred per burst may vary. – Engineer Dollery, Sep 9 '15 at 8:39

@EngineerDollery - yes, that was dumb. I think it is better. – mikeserv, Sep 10 '15 at 8:17

@IQAndreas - I would seriously consider this answer. Personally I use pigz, and the speed increase is amazing. The parallelism is a huge win; CPUs are much faster than any other part of the data pipeline so I doubt parallel compression will slow you down (gzip is not parallelizable). You may find this fast enough that there's no incentive to juggle hard drives; I wouldn't be surprised if this one is overall faster (including disk swap time). You could benchmark with and without compression. In any case, either BlueRaja's diskswap answer or this one should be your accepted answer. – Mike S, Sep 10 '15 at 13:38

Fast compression is excellent advice. It should be noted, though, that it helps only if the data is reasonably compressible, which means, e.g., that it must not already be in a compressed format. – Walter Tross, Sep 10 '15 at 21:50

@WalterTross - it will help if any input is compressible, no matter the ratio, so long as the compression job outperforms the transfer job. On a modern four-core system an lz4 job should easily pace even wide-open GigE, and USB 2.0 doesn't stand a chance. Besides, lz4 was designed only to work when it should - it is partly so fast because it knows when compression should be attempted and when it shouldn't. And if it is a device-file being transferred, then even precompressed input may compress somewhat anyway if there is any fragmentation in the source filesystem. – mikeserv, Sep 10 '15 at 22:13
25 votes
There are several limitations that could be limiting the transfer speed.
There is inherent network overhead on a 1Gbps pipe. Usually, this reduces ACTUAL throughput to 900Mbps or less. Then you have to remember that this is bidirectional traffic and you should expect significantly less than 900Mbps down.
Even though you're using a "new-ish router" are you certain that the router supports 1Gbps? Not all new routers support 1Gbps. Also, unless it is an enterprise-grade router, you likely are going to lose additional transmit bandwidth to the router being inefficient. Though based on what I found below, it looks like you're getting above 100Mbps.
There could be network congestion from other devices sharing your network. Have you tried using a directly attached cable as you said you were able to do?
How much of your disk I/O are you using? Likely you're being limited not by the network but by the disk drive. Most 7200rpm HDDs will only get around 40MB/s. Are you using RAID at all? Are you using SSDs? What are you using on the remote end?
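One hedged way to answer those questions is to measure the network and the disk separately before blaming either; iperf3 and hdparm may need to be installed first, and the address below is a placeholder for the backup server.
# raw TCP throughput, no disks involved
iperf3 -s                      # on the backup server
iperf3 -c 192.168.1.101        # on the source; ~940 Mbit/s is healthy for gigabit, ~94 Mbit/s means a 100Mbit link somewhere

# sequential read speed of the source drive (run as root)
hdparm -t /dev/sda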
I suggest using rsync if this is expected to be re-run for backups. You could also use scp, ftp(s), or http with a downloader like FileZilla on the other end, since it will parallelize ssh/http/https/ftp connections. This can increase the bandwidth, as the other solutions are over a single pipe. A single pipe/thread is still limited by the fact that it is single-threaded, which means it could even be CPU bound.
With rsync, you take out a large amount of the complexity of your solution, and it allows compression, permission preservation, and partial transfers. There are several other reasons, but it is generally the preferred backup method of large enterprises (or what runs their backup systems). Commvault actually uses rsync underneath their software as the delivery mechanism for backups.
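A minimal sketch of the kind of rsync run meant here (host and paths are placeholders; skipping -z compression is usually the right call on a local gigabit link, as discussed elsewhere on this page):
# first run copies everything; subsequent runs only send what changed
rsync -aH --partial --info=progress2 /data/ user@backupserver:/backup/data/
--info=progress2 needs rsync 3.1 or newer; use --progress on older versions.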
Based on your given example of 80GB/h, you're getting around 177Mbps (22.2MB/s). I feel you could easily double this with rsync on a dedicated ethernet line between the two boxes as I've managed to get this in my own tests with rsync over gigabit.
11 – +1 for rsync. It might not be faster the first time you run it, but it certainly will be for all the subsequent times. – Skrrp, Sep 7 '15 at 8:23

4 – > Most 7200rpm HDDs will only get around 40MB/s. IME you're more likely to see over 100MB/s sequential with a modern drive (and this includes ~5k drives). Though, this might be an older disk. – Bob, Sep 7 '15 at 15:38

2 – @Bob: Those modern drives can still read only 5400 circular tracks per minute. These disks are still fast because each track contains more than a megabyte. That does mean they're also quite big disks; a small 320 GB disk can't hold too many kilobytes per track, which necessarily limits their speed. – MSalters, Sep 7 '15 at 17:07

1 – 40MB/s is definitely very pessimistic for sequential read for any drive made in the last decade. Current 7200RPM drives can exceed 100MB/s as Bob says. – hobbs, Sep 10 '15 at 18:00

3 – Gigabit Ethernet is 1000 Mbps full duplex. You get 1000 Mbps (or, as you say, around 900 Mbps in reality) each direction. Second... hard drives now routinely get 100MB/sec. 40MB/sec is slow, unless this is a decade old drive. – derobert, Sep 10 '15 at 20:54
16 votes
We deal with this regularly.
The two main methods we tend to use are:
- SATA/eSATA/sneakernet
- Direct NFS mount, then local cp or rsync
The first is dependent on whether the drive can be physically relocated. This is not always the case.
The second works surprisingly well. Generally we max out a 1Gbps connection rather easily with direct NFS mounts. You won't get anywhere close to this with scp, dd over ssh, or anything similar (you'll often get a max rate suspiciously close to 100Mbps). Even on very fast multicore processors you will hit a bottleneck on the max crypto throughput of one of the cores on the slowest of the two machines, which is depressingly slow compared to full-bore cp or rsync on an unencrypted network mount. Occasionally you'll hit an IOPS wall for a little while and be stuck at around ~53MB/s instead of the more typical ~110MB/s, but that is usually short-lived unless the source or destination is actually a single drive, in which case you might wind up being limited by the sustained rate of the drive itself (which varies enough for random reasons that you won't know until you actually try it) -- meh.
NFS can be a little annoying to set up if it's on an unfamiliar distro, but generally speaking it has been the fastest way to fill the pipes as fully as possible. The last time I did this over 10Gbps I never actually found out if it maxed out the connection, because the transfer was over before I came back from grabbing some coffee -- so there may be some natural limit you hit there. If you have a few network devices between the source and destination you can encounter some slight delays or hiccups from the network slinky effect, but generally this will work across the office (sans other traffic mucking it up) or from one end of the datacenter to the other (unless you have some sort of filtering/inspection occurring internally, in which case all bets are off).
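For reference, a bare-bones version of the NFS-mount-then-copy approach described above; the export path, network range and hostname are assumptions, and export options vary by site.
# on the backup server: export the target directory
echo '/backup 192.168.1.0/24(rw,async,no_root_squash)' >> /etc/exports
exportfs -ra

# on the source: mount it and copy locally, no ssh in the data path
mount -t nfs backupserver:/backup /mnt/backup
rsync -aP /data/ /mnt/backup/data/     # or a plain cp -a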
EDIT
I noticed some chatter about compression... do not compress the connection. It will slow you down the same way a crypto layer will. The bottleneck will always be a single core if you compress the connection (and you won't even be getting particularly good utilization of that core's bus). The slowest thing you can do in your situation is to use an encrypted, compressed channel between two computers sitting next to each other on a 1Gbps or higher connection.
FUTURE PROOFING
This advice stands as of mid-2015. This will almost certainly not be the case for too many more years. So take everything with a grain of salt, and if you face this task regularly then try a variety of methods on actual loads instead of imagining you will get anything close to theoretical optimums, or even observed compression/crypto throughput rates typical for things like web traffic, much of which is textual (protip: bulk transfers usually consist chiefly of images, audio, video, database files, binary code, office file formats, etc. which are already compressed in their own way and benefit very little from being run through yet another compression routine, the compression block size of which is almost guaranteed to not align with your already compressed binary data...).
I imagine that in the future concepts like SCTP will be taken to a more interesting place, where bonded connections (or internally bonded-by-spectrum channelized fiber connections) are typical, and each channel can receive a stream independent of the others, and each stream can be compressed/encrypted in parallel, etc. etc. That would be wonderful! But that's not the case today in 2015, and though fantasizing and theorizing is nice, most of us don't have custom storage clusters running in a cryo-chamber feeding data directly to the innards of a Blue Gene/Q generating answers for Watson. That's just not reality. Nor do we have time to analyze our data payload exhaustively to figure out whether compression is a good idea or not -- the transfer itself would be over before we finished our analysis, regardless how bad the chosen method turned out to be.
But...
Times change and my recommendation against compression and encryption will not stand. I really would love for this advice to be overturned in the typical case very soon. It would make my life easier.
1 – @jofel Only when the network speed is slower than the processor's compression throughput -- which is never true for 1Gbps or higher connections. In the typical case, though, the network is the bottleneck, and compression does effectively speed things up -- but this is not the case the OP describes. – zxq9, Sep 8 '15 at 0:42

2 – lz4 is fast enough to not bottleneck gigE, but depending on what you want to do with the copy, you might need it uncompressed. lzop is pretty fast, too. On my i5-2500k Sandybridge (3.8GHz), lz4 < /dev/raid0 | pv -a > /dev/null goes at ~180MB/s input, ~105MB/s output, just right for gigE. Decompressing on the receive side is even easier on the CPU. – Peter Cordes, Sep 9 '15 at 21:34

@PeterCordes lz4 can indeed fit 1Gbps connections roughly the same way other compression routines fit 100Mbps connections -- given the right circumstances, anyway. The basic problem is that compression itself is a bottleneck relative to a direct connection between systems, and usually a bottleneck between two systems on the same local network (typically 1Gbps or 10Gbps). The underlying lesson is that no setting is perfect for all environments and rules of thumb are not absolute. Fortunately, running a test transfer to figure out what works best in any immediate scenario is trivial. – zxq9, Sep 10 '15 at 4:39

1 – Also, 3.8GHz is a fair bit faster than most server processors run (or many business-grade systems of any flavor, at least that I'm used to seeing). It is more common to see much higher core counts with much lower clock speeds in data centers. Parallelization of transfer loads has not been an issue for a long time, so we are stuck with the max speed of a single core in most cases -- but I expect this will change now that clock speeds are generally maxed out but network speeds still have a long way to go before hitting their maximums. – zxq9, Sep 10 '15 at 4:44

2 – I completely disagree with your comments regarding compression. It depends completely on the compressibility of the data. If you could get a 99.9% compression ratio, it would be foolish not to do so -- why transfer 100GB when you can get away with transferring 100MB? I'm not suggesting that this level of compression is the case for this question, just showing that this has to be considered on a case by case basis and that there are no absolute rules. – Engineer Dollery, Sep 10 '15 at 10:31
6 votes
A nifty tool that I've used in the past is bbcp. As seen here: https://www.slac.stanford.edu/~abh/bbcp/ .
See also http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm
I've had very fast transfer speeds with this tool.
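For a rough idea of an invocation: the flags below (parallel streams, a larger window, periodic progress reports) exist in the bbcp builds I have used, but treat them as assumptions and check bbcp --help on your version.
# 8 parallel TCP streams, 2MB window, progress report every 2 seconds
bbcp -P 2 -s 8 -w 2M /path/to/bigfile user@backupserver:/backup/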
1 – The second link of this answer explains how to tune kernel parameters to reach higher speeds. The author there got 800 megabytes per second on a 10G link and some things seem applicable to 1Gbps links. – Stéphane Gourichon, Sep 8 '15 at 9:20
5 votes
If you get a first pass somehow (over the wire/sneakernet/whatever), you can look into rsync with certain options that can greatly speed up subsequent transfers. A very good way to go would be:
rsync -varzP sourceFiles destination
Options are: verbose, archive mode, recursive, compress, Partial progress
2 – rsync is more reliable than netcat, but archive implies recursive, so the r is redundant. – Tanath, Sep 9 '15 at 20:04

Also, -z can be incredibly slow depending on your CPU and what data you are processing. I've experienced transfers going from 30 MB/s to 125 MB/s when disabling compression. – lindhe, Jan 14 '16 at 20:22
4 votes
Added at the insistence of the original poster in comments to zackse's answer, although I'm not sure it is the fastest in typical circumstances.
bash has a special redirection syntax:
For output: > /dev/tcp/IP/port
For input: < /dev/tcp/IP/port
IP can be either a dotted-decimal IP or a hostname; port can be either a decimal number or a port name from /etc/services.
There is no actual /dev/tcp/ directory. It's a special syntactic kludge that commands bash to create a TCP socket, connect it to the destination specified, and then do the same thing as a usual file redirection does (namely, replace the respective standard stream with the socket using dup2(2)).
Hence, one can stream data from dd or tar at the source machine directly via TCP. Or, conversely, stream data to tar or something alike directly via TCP. In any case, one superfluous netcat is eliminated.
Notes about netcat
There is an inconsistency in syntax between classical netcat and GNU netcat. I'll use the classical syntax I'm accustomed to. Replace -lp with -l for GNU netcat. Also, I'm unsure whether GNU netcat accepts the -q switch.
Transferring a disk image
(Along the lines of zackse's answer.)
On destination:
nc -lp 9999 >disk_image
On source:
dd if=/dev/sda >/dev/tcp/destination/9999
Creating a tar.gz archive, with tar
On destination:
nc -lp 9999 >backup.tgz
On source:
tar cz files or directories to be transferred >/dev/tcp/destination/9999
Replace .tgz with .tbz and cz with cj to get a bzip2-compressed archive.
Transferring with immediate expansion to file system
Also with tar.
On destination:
cd backups
tar x </dev/tcp/source/9999
On source:
tar c files or directories to be transferred |nc -q 1 -lp 9999
It will work without -q 1, but netcat will get stuck when the data ends. See tar(1) for an explanation of the syntax and caveats of tar. If there are many files with high redundancy (low entropy), then compression (e.g. cz and xz instead of c and x) can be tried, but if files are typical and the network is fast enough, it would only slow the process. See mikeserv's answer for details about compression.
Alternative style (the destination listens on a port)
On destination:
cd backups
nc -lp 9999 |tar x
On source:
tar c files or directories to be transferred >/dev/tcp/destination/9999
bash can't actually "listen" on a socket, apparently, in order to wait and receive a file: unix.stackexchange.com/questions/49936/… so you'd have to use something else for at least one half of the connection... – rogerdpack, Sep 27 at 22:07
3 votes
Try the suggestions regarding direct connections and avoiding encrypted protocols such as ssh. Then if you still want to eke out every bit of performance give this site a read: https://fasterdata.es.net/host-tuning/linux/ for some advice on optimizing your TCP windows.
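The core of that tuning is raising the socket buffer limits; a hedged sketch follows (values in the spirit of the fasterdata recommendations; the right numbers depend on your bandwidth-delay product, and on a low-latency LAN the defaults are often already fine).
# /etc/sysctl.d/90-net-tuning.conf, applied with: sysctl --system
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_rmem = 4096 87380 33554432
net.ipv4.tcp_wmem = 4096 65536 33554432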
2 votes
I would use this script I wrote that needs the socat package.
On the source machine:
tarnet -d wherefilesaretosend pass=none 12345 .
On the target machine:
tarnet -d wherefilesaretogo pass=none sourceip/12345
If the vbuf package (Debian, Ubuntu) is there, then the file sender will show data progress. The file receiver will show what files are received.
The pass= option can be used where the data might be exposed (slower).
Edit:
Use the -n option to disable compression if the CPU is a bottleneck.
2 votes
If budget is not the main concern, you could try connecting the drives with an Intel Xeon E5 12-core "drive connector". This connector is usually so powerful that you can even run your current server software on it. From both servers!
This might look like a fun answer, but you should really consider why you are moving the data between servers and if a big one with shared memory and storage might make more sense.
Not sure about current specs, but the slow transfer might be limited by the disk speeds, not the network?
1 vote
If you only care about backups, and not about a byte for byte copy of the hard drive, then I would recommend BackupPC. http://backuppc.sourceforge.net/faq/BackupPC.html It's a bit of a pain to set up but it transfers very quickly.
My initial transfer time for about 500G of data was around 3 hours. Subsequent backups happen in about 20 seconds.
If you're not interested in backups, but are trying to sync things, then rsync or unison would better fit your needs.
A byte for byte copy of a hard disk is usually a horrid idea for backup purposes (no incrementals, no space saving, the drive can't be in use, you have to back up the "empty space", and you have to back up garbage like a 16G swap file or 200G of core dumps or some such). Using rsync (or backuppc or others) you can create "snapshots" in time so you can go to "what your file system looked like 30 mins ago" with very little overhead.
That said, if you really want to transfer a byte for byte copy, then your problem is going to lie in the transfer and not in getting data from the drive. Without 400G of RAM a 320G file transfer is going to take a very long time. Using protocols that are not encrypted is an option, but no matter what, you're just going to have to sit there and wait for several hours (over the network).
1 – how does 400G of RAM speed up data transfer? – Skaperen, Sep 7 '15 at 9:39

Not sure this was the intent, but I read it as "any medium slower than RAM to RAM transfer is going to take a while", rather than "buy 400 GB of RAM and your HDD to HDD transfer will go faster". – MichaelS, Sep 7 '15 at 10:07

Yep, RAM will buffer for you, and it will seem faster. You can do an HD to HD transfer with RAM buffering all the way and it will seem very fast. It will also take quite a while to flush to disk, but HD to RAM to RAM to HD is faster than HD to HD. (Keep in mind you have to do HD to RAM to RAM to HD anyway, but if you have less than your entire transfer size of RAM you will have to "flush" in segments.) – coteyr, Sep 7 '15 at 12:07

Another way to put it is that to compress or even just send, the entire source drive has to be read into RAM. If it doesn't fit all at once, it has to read a segment, send, discard the segment, seek, read a segment, etc. If it fits all at once then it just has to read it all at one time. Same on the destination. – coteyr, Sep 7 '15 at 12:10

1 – HD to RAM to RAM to HD is faster than HD to HD. How can it be quicker? – A.L, Sep 11 '15 at 17:36
1 vote
Regardless of program, I have usually found that "pulling" files over a network is faster than "pushing". That is, logging into the destination computer and doing a read is faster than logging into the source computer and doing a write.
Also, if you are going to use an intermediate drive, consider this: get an external drive (either as a package, or a separate drive plugged into a docking station) which uses eSATA rather than USB. Then on each of the two computers either install a card with an eSATA port, or get a simple adapter cable which brings one of the internal SATA ports to an external eSATA connector. Then plug the drive into the source computer, power up the drive, and wait for it to auto-mount (you could mount manually, but if you are doing this repeatedly you might as well put it into your fstab file). Then copy; you will be writing at the same speed as to an internal drive. Then unmount the drive, power down, plug into the other computer, power up, wait for an auto-mount, and read.
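As a concrete (hypothetical) illustration of pulling versus pushing with rsync; hostnames and paths are placeholders.
# "pull": run on the destination (backup) machine, reading from the source
rsync -aP user@sourcehost:/data/ /backup/data/

# "push": the same copy initiated from the source instead
# rsync -aP /data/ user@backuphost:/backup/data/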
2 – Can you provide specifics of how you're "pulling" files? What utilities are you using, and can you provide any sample showing this effect? – STW, Sep 10 '15 at 13:22

I am not sure if this will be a more complete answer, but consider this scenario: Suppose you have two computers, foo and bar, and you want to copy data from foo to bar. (1) You log into foo, then remote-mount the drive which is physically attached to bar. Then you copy from foo's disk onto the remotely-mounted directory (which is physically on bar). I called this pushing the data to the other computer. (2) Compare this to the other way of copying the same data. Log into bar, remote-mount the directory attached to foo, and read from foo onto bar's drive. This is pulling. – Mike Ciaraldi, Sep 10 '15 at 17:42

This copying can be done with the Linux cp command, from a GUI file manager, or any other way of copying files. I think pulling turns out to be faster because writing is slower than reading, and more of the decisions on how to write to the destination disk are being made on the same computer the drive is attached to, so there is less overhead. But maybe this is no longer the case with more modern systems. – Mike Ciaraldi, Sep 10 '15 at 17:45
1 vote
I'm going to recommend that you look at NIC-teaming. This involves using multiple network connections running in parallel. Assuming that you really need more than 1Gb transfer, and that 10Gb is cost prohibitive, 2Gbs provided by NIC-teaming would be a minor cost, and your computers may already have the extra ports.
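Note that a single TCP stream generally won't exceed one physical link even when bonded, so the simplest way to cash in the extra port is often to give each NIC pair its own private subnet and run two transfers in parallel, one per link; the interface names, addresses, and paths below are made up.
# on the source server
ip addr add 10.0.1.1/24 dev eth1
ip addr add 10.0.2.1/24 dev eth2
# on the backup server
ip addr add 10.0.1.2/24 dev eth1
ip addr add 10.0.2.2/24 dev eth2

# then run one transfer per link, e.g. splitting the data in half
rsync -aP /data/part1/ user@10.0.1.2:/backup/ &
rsync -aP /data/part2/ user@10.0.2.2:/backup/ &
wait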
If you're referring to LACP (Link Aggregation Control Protocol) then you're not going to see an increase in speed. It provides redundancy and some ability to serve more concurrent connections, but it won't provide a speed boost for this type of transfer. – STW, Sep 9 '15 at 21:04

@STW: It requires switch support to aggregate two links to one machine into a 2gbit link, but it is possible. Helpful only if both machines have a 2gbit link to the switch, though. If you have two cables running NIC<->NIC, with no switch, that should work too, but isn't very useful (unless you have a 3rd NIC in one machine to keep them connected to the Internet). – Peter Cordes, Sep 10 '15 at 6:03

is there a specific name for this feature in switches? – STW, Sep 10 '15 at 13:21

There are several variations of NIC-teaming, EtherChannel, etc. STW is right for certain configurations, this won't help, but for some configurations, it would. It comes down to whether or not the bonded channel speeds up performance for a single IP socket or not. You'll need to research the specifics to determine if this is a viable solution for you. – Byron Jones, Sep 11 '15 at 20:18

802.3ad is the open standard that you'd look for on your switches. As a quick hack, though, you might just connect extra NICs to the network, and give them appropriate IP addresses on separate subnets in private address space. (Host 1 port a & host 2 port a get one subnet, host 1 port b and host 2 port b get another subnet.) Then just run two parallel jobs to do the transfer. This will be a lot simpler than learning the ins and outs of EtherChannel, 802.3ad, etc. – Dan Pritts, Sep 14 '15 at 14:02
1 vote
FWIW, I've always used this:
tar -cpf - <source path> | ssh user@destserver "cd /; tar xf -"
The thing about this method is that it will maintain file/folder permissions between machines (assuming the same users/groups exist on both).
(Also I typically do this to copy virtual disk images since I can use a -S parameter to handle sparse files.)
Just tested this between two busy servers and managed ~14GB in 216s (about 64MB/s) - it might do better between dedicated machines and/or with compression... YMMV
$ date; tar -cpf - Installers | ssh elvis "cd /home/elvis/tst; tar xf -"; date
Wed Sep 9 15:23:37 EDT 2015
Wed Sep 9 15:27:13 EDT 2015
$ du -s Installers
14211072 Installers
1 vote
Unless you want to do filesystem forensics, use a dump/restore program for your filesystem to avoid copying the free space that the FS isn't using. Depending on what filesystem you have, this will typically preserve all metadata, including ctime. Inode numbers may change, though, again depending on the filesystem (xfs, ext4, ufs...).
The restore target can be a file on the target system.
If you want a full-disk image with the partition table, you can dd the first 1M of the disk to get the partition table / bootloaders / stuff, but then xfsdump the partitions.
I can't tell from your info-dump what kind of filesystem you actually have. If it's BSD ufs, then I think that has a dump/restore program. If it's ZFS, well IDK, there might be something.
Generally full-copying disks around is too slow for anything except recovery situations. You can't do incremental backups that way, either.
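For XFS specifically, the piped form might look roughly like this; the mount points and host are assumptions, and other filesystems have their own equivalents (dump/restore for ext4, zfs send/recv for ZFS). The lone '-' tells xfsdump/xfsrestore to use stdout/stdin.
# on the backup server: receive the stream into a directory
nc -l 9999 | xfsrestore -J - /backup/sda-restore

# on the source: level-0 dump of the mounted filesystem to stdout
xfsdump -J -l 0 - /srv | nc backupserver 9999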
1 vote
You could also set up the systems to have shared storage!
I am considering that these are next to each other, and you are likely to do this again and again....
1 vote
How about an ethernet crossover cable? Instead of relying on wireless speeds you're capped at the wired speed of your NIC.
Here's a similar question with some examples of that kind of solution.
Apparently just a typical ethernet cable will suffice nowadays. Obviously the better your NIC the faster the transfer.
To summarize, if any network setup is necessary, it should be limited to simply setting static IPs for your server and backup computer with a subnet mask 255.255.255.0
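For instance (interface names and addresses here are made up; adjust to your hardware):
# on the source server
ip addr add 192.168.100.1/24 dev eth1
ip link set eth1 up
# on the backup server
ip addr add 192.168.100.2/24 dev eth1
ip link set eth1 up
# then point nc/rsync at 192.168.100.2 over the direct cable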
Good luck!
Edit:
@Khrystoph touched on this in his answer
How will it improve speed rates? Can you please explain it in your answer? – A.L, Sep 11 '15 at 17:43

1 – It would potentially improve speed because you would not have to worry about the intermediate network slowing you down. Regarding "typical" vs "crossover" ethernet cables - 1Gb ethernet will auto-crossover as necessary. HP ethernet switches will do this at 100Mb. Other brands, generally not, and you'll need a crossover if you're stuck at 100Mb. – Dan Pritts, Sep 11 '15 at 20:47
1 vote
Several folks recommend that you skip ssh because encryption will slow you down. Modern CPUs may actually be fast enough at 1Gb, but OpenSSH has problems with its internal windowing implementation that can drastically slow you down.
If you want to do this with ssh, take a look at HPN SSH. It solves the windowing problems and adds multithreaded encryption. Unfortunately you'll need to rebuild ssh on both the client & server.
0 votes
OK I have attempted to answer this question for two computers with "very large pipes" (10Gbe) that are "close" to each other.
The problem you run into here is: most compression will bottleneck at the cpu, since the pipes are so large.
Performance to transfer a 10GB file (6 Gb network connection [Linode], uncompressible data):
$ time bbcp 10G root@$dest_ip:/dev/null
0m16.5s
iperf:
server: $ iperf3 -s -F /dev/null
client:
$ time iperf3 -c $dest_ip -F 10G -t 20 # -t needs to be greater than time to transfer complete file
0m13.44s
(30% cpu)
netcat (1.187 openbsd):
server: $ nc -l 1234 > /dev/null
client: $ time nc $dest_ip 1234 -q 0 < 10G
0m13.311s
(58% cpu)
scp:
$ time /usr/local/bin/scp 10G root@$dest_ip:/dev/null
1m31.616s
scp with hpn ssh patch (scp -- hpn patch on client only, so not a good test possibly):
1m32.707s
socat:
server:
$ socat -u TCP-LISTEN:9876,reuseaddr OPEN:/dev/null,creat,trunc
client:
$ time socat -u FILE:10G TCP:$dest_ip:9876
0m15.989s
And two boxes on 10 Gbe, slightly older versions of netcat (CentOs 6.7), 10GB file:
nc: 0m18.706s (100% cpu, v1.84, no -q option)
iperf3: 0m10.013s (100% cpu, but can go up to at least 20Gbe with 100% cpu so not sure it matters)
socat: 0m10.293s (88% cpu, possibly maxed out)
So on one instance netcat used less cpu, on the other socat, so YMMV.
With netcat, if it doesn't have a "-N -q 0" option it can transfer truncated files, be careful...other options like "-w 10" can also result in truncated files.
What is happening in almost all of these cases is that the cpu is being maxed out, not the network. scp maxes out at about 230 MB/s, pegging one core at 100% utilization.
Iperf3 unfortunately creates corrupted files. Some versions of netcat seem to not transfer the entire file, very weird. Especially older versions of it.
Various incantations of "gzip as a pipe to netcat" or "mbuffer" also seemed to max out the cpu with the gzip or mbuffer, so didn't result in faster transfer with such large pipes. lz4 might help. In addition, some of the gzip pipe stuff I attempted resulted in corrupted transfers for very large (> 4 GB) files so be careful out there :)
Another thing that might work especially for higher latency (?) is to tune tcp settings. Here is a guide that mention suggested values:
http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm and https://fasterdata.es.net/host-tuning/linux/ (from another answer)
possibly IRQ settings: https://fasterdata.es.net/host-tuning/100g-tuning/
suggestions from linode, add to /etc/sysctl.conf:
net.core.rmem_max = 268435456
net.core.wmem_max = 268435456
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
net.core.netdev_max_backlog = 250000
net.ipv4.tcp_no_metrics_save = 1
net.core.default_qdisc = fq
Additionally, they would like you to run:
/sbin/ifconfig eth0 txqueuelen 10000
It is worth double-checking after tweaking to make sure the changes don't cause harm.
Also may be worth tuning the window size: https://iperf.fr/iperf-doc.php#tuningtcp
With slow(er) connections compression can definitely help though. If you have big pipes, very fast compression might help with readily compressible data, haven't tried it.
The standard answer for "syncing hard drives" is to rsync the files, that avoids transfer where possible.
20 Answers
20
active
oldest
votes
20 Answers
20
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
136
down vote
Since the servers are physically next to each other, and you mentioned in the comments you have physical access to them, the fastest way would be to take the hard-drive out of the first computer, place it into the second, and transfer the files over the SATA connection.
15
+1: Transferring via physical seems to be the fastest route, even if it means getting a big external hard drive from somewhere. It's about £40, and you've probably spent that much in time already,
â deworde
Sep 7 '15 at 10:06
3
I completely disagree with this idea if one is getting full speed across a gigabit network. Testing over NFS/SMB over a Zyxel Gigabit switch between an HP Gen 7 microserver, and a Pentium G630 machine gives me ~100MB/s transfer. (Until I leave the outer edge of the drive platters.) So I think it'd realistically be done in under 3 hours. Unless you are using SSD's or extremely high performance drives/storage, I don't think 2 copies can produce a throughput of 100MB/s, that would require each copy operation to be 200MB/s just to break even.
â Phizes
Sep 8 '15 at 1:52
3
@Phizes: obviously you don't copy to a temporary. That was deword's bad idea, not what everyone else is talking about. The point of connecting the source drive to the target machine is to go SATA->SATA withdd
(or a filesystem tree copy).
â Peter Cordes
Sep 9 '15 at 21:12
10
"Never underestimate the bandwidth of a truck full of hard drives. One hell of a latency though"
â Kevin
Sep 11 '15 at 4:31
3
@Kevin: yes, my point was that a direct copy between disks in the same computer is at least as fast as any other possible method. I brought up real-life bandwidth numbers to acknowledge Phize's point that going over gigE is fine for the OPs old drive, but a bottleneck for new drives. (One case where both drives in one computer is not the best option is when having separate computers using their RAM to cache the metadata of the source and dest is important, e.g. for rsync of billions of files.)
â Peter Cordes
Sep 13 '15 at 3:59
 |Â
show 9 more comments
up vote
136
down vote
Since the servers are physically next to each other, and you mentioned in the comments you have physical access to them, the fastest way would be to take the hard-drive out of the first computer, place it into the second, and transfer the files over the SATA connection.
15
+1: Transferring via physical seems to be the fastest route, even if it means getting a big external hard drive from somewhere. It's about £40, and you've probably spent that much in time already,
â deworde
Sep 7 '15 at 10:06
3
I completely disagree with this idea if one is getting full speed across a gigabit network. Testing over NFS/SMB over a Zyxel Gigabit switch between an HP Gen 7 microserver, and a Pentium G630 machine gives me ~100MB/s transfer. (Until I leave the outer edge of the drive platters.) So I think it'd realistically be done in under 3 hours. Unless you are using SSD's or extremely high performance drives/storage, I don't think 2 copies can produce a throughput of 100MB/s, that would require each copy operation to be 200MB/s just to break even.
â Phizes
Sep 8 '15 at 1:52
3
@Phizes: obviously you don't copy to a temporary. That was deword's bad idea, not what everyone else is talking about. The point of connecting the source drive to the target machine is to go SATA->SATA withdd
(or a filesystem tree copy).
â Peter Cordes
Sep 9 '15 at 21:12
10
"Never underestimate the bandwidth of a truck full of hard drives. One hell of a latency though"
â Kevin
Sep 11 '15 at 4:31
3
@Kevin: yes, my point was that a direct copy between disks in the same computer is at least as fast as any other possible method. I brought up real-life bandwidth numbers to acknowledge Phize's point that going over gigE is fine for the OPs old drive, but a bottleneck for new drives. (One case where both drives in one computer is not the best option is when having separate computers using their RAM to cache the metadata of the source and dest is important, e.g. for rsync of billions of files.)
â Peter Cordes
Sep 13 '15 at 3:59
 |Â
show 9 more comments
up vote
136
down vote
up vote
136
down vote
Since the servers are physically next to each other, and you mentioned in the comments you have physical access to them, the fastest way would be to take the hard-drive out of the first computer, place it into the second, and transfer the files over the SATA connection.
Since the servers are physically next to each other, and you mentioned in the comments you have physical access to them, the fastest way would be to take the hard-drive out of the first computer, place it into the second, and transfer the files over the SATA connection.
answered Sep 7 '15 at 5:51
BlueRaja - Danny Pflughoeft
1,207166
1,207166
15
+1: Transferring physically seems to be the fastest route, even if it means getting a big external hard drive from somewhere. It's about £40, and you've probably spent that much in time already.
– deworde
Sep 7 '15 at 10:06
3
I completely disagree with this idea if one is getting full speed across a gigabit network. Testing over NFS/SMB over a Zyxel Gigabit switch between an HP Gen 7 microserver and a Pentium G630 machine gives me ~100MB/s transfer (until I leave the outer edge of the drive platters), so I think it'd realistically be done in under 3 hours. Unless you are using SSDs or extremely high-performance drives/storage, I don't think two copies can produce a throughput of 100MB/s; that would require each copy operation to be 200MB/s just to break even.
– Phizes
Sep 8 '15 at 1:52
3
@Phizes: obviously you don't copy to a temporary. That was deworde's bad idea, not what everyone else is talking about. The point of connecting the source drive to the target machine is to go SATA->SATA with dd (or a filesystem tree copy).
– Peter Cordes
Sep 9 '15 at 21:12
10
"Never underestimate the bandwidth of a truck full of hard drives. One hell of a latency though."
– Kevin
Sep 11 '15 at 4:31
3
@Kevin: yes, my point was that a direct copy between disks in the same computer is at least as fast as any other possible method. I brought up real-life bandwidth numbers to acknowledge Phizes's point that going over GigE is fine for the OP's old drive, but a bottleneck for new drives. (One case where both drives in one computer is not the best option is when having separate computers using their RAM to cache the metadata of the source and dest is important, e.g. for rsync of billions of files.)
– Peter Cordes
Sep 13 '15 at 3:59
 |Â
show 9 more comments
up vote
69
down vote
netcat
is great for situations like this where security is not an issue:
# on destination machine, create listener on port 9999
nc -l 9999 > /path/to/outfile
# on source machine, send to destination:9999
nc destination_host_or_ip 9999 < /dev/sda
# or dd if=/dev/sda | nc destination_host_or_ip 9999
Note: if you are using dd from GNU coreutils, you can send SIGUSR1 to the process and it will emit progress to stderr. For BSD dd, use SIGINFO.
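For example, a minimal sketch of that signal trick from another terminal on the source machine (the pkill targeting assumes dd is the process name you want to poke):
# ask the running GNU dd for a progress line on its stderr
pkill -USR1 -x dd
# or repeat it every 10 seconds for an ongoing report
watch -n10 'pkill -USR1 -x dd'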
pv is even more helpful in reporting progress during the copy:
# on destination
nc -l 9999 | pv > /path/to/outfile
# on source
pv /dev/sda | nc destination_host_or_ip 9999
# or dd if=/dev/sda | pv | nc destination_host_or_ip 9999
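Since the size of /dev/sda is known in advance, you can also tell pv how much data to expect so it shows a percentage and ETA instead of only throughput. A small variation on the source-side command above (blockdev needs root, which reading /dev/sda requires anyway):
# on source: give pv the exact device size for a percentage bar and ETA
pv -s "$(blockdev --getsize64 /dev/sda)" /dev/sda | nc destination_host_or_ip 9999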
edited Sep 7 '15 at 4:29
answered Sep 7 '15 at 4:00
zackse
2
For the second example, is dd even required, or can pv/nc treat /dev/sda just fine on their own? (I have noticed some commands "throw up" when trying to read special files like that one, or files with 0x00 bytes.)
– IQAndreas
Sep 7 '15 at 4:07
5
@user1794469 Will the compression help? I'm thinking the network is not where the bottleneck is.
– IQAndreas
Sep 7 '15 at 6:40
16
Don't forget that in bash one can use > /dev/tcp/IP/port and < /dev/tcp/IP/port redirections instead of piping to and from netcat respectively.
– Incnis Mrsi
Sep 7 '15 at 16:41
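As a minimal sketch of that bash-only variant (note that /dev/tcp is a bash feature rather than a real device, so the sending command must run in bash, and the receiving side still needs a listener such as nc):
# on destination, listen as before
nc -l 9999 > /path/to/outfile
# on source, plain bash redirection instead of piping into netcat
cat /dev/sda > /dev/tcp/destination_host_or_ip/9999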
5
Good answer. Gigabit Ethernet is often faster than hard drive speed, so compression is useless. To transfer several files consider tar cv sourcedir | pv | nc dest_host_or_ip 9999 and cd destdir ; nc -l 9999 | pv | tar xv. Many variations are possible; you might e.g. want to keep a .tar.gz at the destination side rather than copies. If you copy directory to directory, for extra safety you can perform an rsync afterwards, e.g. from dest: rsync --inplace -avP user@192.168.1.100:/path/to/source/. /path/to/destination/. will guarantee that all files are indeed exact copies.
– Stéphane Gourichon
Sep 8 '15 at 9:08
3
Instead of using IPv4 you can achieve better throughput by using IPv6 because it has a bigger payload. You don't even need to configure it: if the machines are IPv6 capable they probably already have an IPv6 link-local address.
– David Costa
Sep 8 '15 at 13:07
 |Â
show 9 more comments
up vote
33
down vote
Do use fast compression.
- Whatever your transfer medium - especially for network or usb - you'll be working with data bursts for reads, caches, and writes, and these will not exactly be in sync.
- Besides the disk firmware, disk caches, and kernel/ram caches, if you can also employ the systems' CPUs in some way to concentrate the amount of data exchanged per burst then you should do so.
- Any compression algorithm at all will automatically handle sparse runs of input as fast as possible, but there are very few that will handle the rest at network throughputs.
lz4 is your best option here:
LZ4 is a very fast lossless compression algorithm, providing compression speed at 400 MB/s per core, scalable with multi-cores CPU. It also features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.
Preferably do not unnecessarily seek.
- This can be difficult to gauge.
If there is a lot of free space on the device from which you copy, and the device has not been recently zeroed, but all of the source file-system(s) should be copied, then it is probably worth your while to first do something like:
</dev/zero tee >empty empty1 empty2; sync; rm empty*
But that depends on what level you should be reading the source. It is usually desirable to read the device from start to finish from its /dev/some_disk device file, because reading at the file-system level will generally involve seeking back and forth around the disk non-sequentially. And so your read command should be something like:
</dev/source_device lz4 | ...
However, if your source file-system should not be transferred in its entirety, then reading at the file-system level is fairly unavoidable, and so you should ball up your input contents into a stream. pax is generally the best and simplest solution in that case, but you might also consider mksquashfs:
pax -w /source/tree[12] | lz4 | ...
mksquashfs /source/tree[12] /dev/fd/1 -comp lz4 | ...
Do not encrypt with ssh.
- Adding encryption overhead to a trusted medium is unnecessary, and can be severely detrimental to the speed of sustained transfers in that the data read needs reading twice.
- The PRNG needs the read data, or at least some of it, to sustain randomness.
- And of course you need to transfer the data as well.
- You also need to transfer the encryption overhead itself - which means more work for less data transferred per burst.
And so rather you should use netcat (or, as I prefer, the nmap project's more capable ncat) for a simple network copy, as has elsewhere been suggested:
### on tgt machine...
nc -l 9999 > out.lz4
### then on src machine...
... lz4 | nc tgt.local 9999
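As a concrete end-to-end sketch of the kind of pipeline this answer describes (the tgt.local hostname is taken from the example above, the output path is a placeholder, pv is optional progress reporting, and lz4 -d on the receiving end decompresses back to a raw image):
### on tgt machine: listen, decompress, and write the raw image
nc -l 9999 | lz4 -d | pv > /path/to/sda.img
### then on src machine: read the device, compress, and send
</dev/sda lz4 | nc tgt.local 9999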
edited Oct 18 '16 at 6:08
answered Sep 8 '15 at 9:08
mikeserv
1
Fantastic answer. One minor grammatical point -- "lessen the amount of data which needs exchanging per burst" -- I think you're using compression to increase the information density, as the 'bursts' are fixed-width and therefore the amount of exchanged data remains constant though the information transferred per burst may vary.
– Engineer Dollery
Sep 9 '15 at 8:39
@EngineerDollery - yes, that was dumb. I think it is better.
– mikeserv
Sep 10 '15 at 8:17
@IQAndreas - I would seriously consider this answer. Personally I use pigz, and the speed increase is amazing. The parallelism is a huge win; CPUs are much faster than any other part of the data pipeline, so I doubt parallel compression will slow you down (gzip is not parallelizable). You may find this fast enough that there's no incentive to juggle hard drives; I wouldn't be surprised if this one is overall faster (including disk swap time). You could benchmark with and without compression. In any case, either BlueRaja's diskswap answer or this one should be your accepted answer.
– Mike S
Sep 10 '15 at 13:38
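If you want to try the pigz variant this comment describes, a minimal sketch might look like this (it assumes pigz and unpigz are installed on the respective ends; -1 trades compression ratio for speed):
# on destination: listen and decompress with parallel gzip
nc -l 9999 | unpigz | pv > /path/to/outfile
# on source: compress across all cores at the fastest setting, then send
pv /dev/sda | pigz -1 | nc destination_host_or_ip 9999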
Fast compression is excellent advice. It should be noted, though, that it helps only if the data is reasonably compressible, which means, e.g., that it must not already be in a compressed format.
– Walter Tross
Sep 10 '15 at 21:50
@WalterTross - it will help if any input is compressible, no matter the ratio, so long as the compression job outperforms the transfer job. On a modern four-core system an lz4 job should easily pace even wide-open GigE, and USB 2.0 doesn't stand a chance. Besides, lz4 was designed only to work when it should - it is partly so fast because it knows when compression should be attempted and when it shouldn't. And if it is a device-file being transferred, then even precompressed input may compress somewhat anyway if there is any fragmentation in the source filesystem.
– mikeserv
Sep 10 '15 at 22:13
 |Â
show 6 more comments
up vote
25
down vote
There are several factors that could be limiting the transfer speed.
There is inherent network overhead on a 1Gbps pipe. Usually, this reduces ACTUAL throughput to 900Mbps or less. Then you have to remember that this is bidirectional traffic and you should expect significantly less than 900Mbps down.
Even though you're using a "new-ish router", are you certain that the router supports 1Gbps? Not all new routers support 1Gbps. Also, unless it is an enterprise-grade router, you likely are going to lose additional transmit bandwidth to the router being inefficient. Though based on what I found below, it looks like you're getting above 100Mbps.
There could be network congestion from other devices sharing your network. Have you tried using a directly attached cable as you said you were able to do?
What amount of your disk IO are you using? Likely, you're being limited, not by the network, but by the disk drive. Most 7200rpm HDDs will only get around 40MB/s. Are you using RAID at all? Are you using SSDs? What are you using on the remote end?
I suggest using rsync if this is expected to be re-run for backups. You could also use scp, FTP(S), or HTTP with a downloader like FileZilla on the other end, as it will parallelize ssh/http/https/ftp connections. This can increase the bandwidth, since the other solutions run over a single pipe; a single pipe/thread is limited by the fact that it is single-threaded, which means it could even be CPU bound.
With rsync you take out a large amount of the complexity of your solution, and it also allows compression, permission preservation, and partial transfers. There are several other reasons, but it is generally the preferred backup method (or runs the backup systems) of large enterprises. Commvault actually uses rsync underneath their software as the delivery mechanism for backups.
Based on your given example of 80GB/h, you're getting around 177Mbps (22.2MB/s). I feel you could easily double this with rsync on a dedicated ethernet line between the two boxes as I've managed to get this in my own tests with rsync over gigabit.
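As a rough sketch of the rsync route for a file-level copy (the 192.168.1.100 address comes from the question; the paths are placeholders and the flags are a reasonable starting point, not a prescription):
# pull from the source server onto the backup server;
# -a preserves permissions/ownership/times, -P shows progress and keeps partial files
rsync -aP user@192.168.1.100:/path/to/source/ /path/to/backup/
# add -z only if the network, rather than the disks or CPU, turns out to be the bottleneck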
edited Sep 7 '15 at 6:31
muru
answered Sep 7 '15 at 5:53
Khrystoph
11
+1 for rsync. It might not be faster the first time you run it, but it certainly will be for all the subsequent times.
– Skrrp
Sep 7 '15 at 8:23
4
> Most 7200rpm HDDs will only get around 40MB/s. IME you're more likely to see over 100MB/s sequential with a modern drive (and this includes ~5k drives). Though, this might be an older disk.
– Bob
Sep 7 '15 at 15:38
2
@Bob: Those modern drives can still read only 5400 circular tracks per minute. These disks are still fast because each track contains more than a megabyte. That does mean they're also quite big disks; a small 320 GB disk can't hold too many kilobytes per track, which necessarily limits their speed.
– MSalters
Sep 7 '15 at 17:07
1
40MB/s is definitely very pessimistic for sequential read for any drive made in the last decade. Current 7200RPM drives can exceed 100MB/s as Bob says.
– hobbs
Sep 10 '15 at 18:00
3
Gigabit Ethernet is 1000Mbps full duplex. You get 1000Mbps (or, as you say, around 900Mbps in reality) each direction. Second... hard drives now routinely get 100MB/sec. 40MB/sec is slow, unless this is a decade-old drive.
– derobert
Sep 10 '15 at 20:54
 |Â
show 2 more comments
up vote
16
down vote
We deal with this regularly.
The two main methods we tend to use are:
- SATA/eSATA/sneakernet
- Direct NFS mount, then local cp or rsync
The first is dependent on whether the drive can be physically relocated. This is not always the case.
The second works surprisingly well. Generally we max out a 1gbps connection rather easily with direct NFS mounts. You won't get anywhere close to this with scp, dd over ssh, or anything similar (you'll often get a max rate suspiciously close to 100Mbps). Even on very fast multicore processors you will hit a bottleneck on the max crypto throughput of one of the cores on the slowest of the two machines, which is depressingly slow compared to full-bore cp or rsync on an unencrypted network mount. Occasionally you'll hit an iops wall for a little while and be stuck at around ~53MB/s instead of the more typical ~110MB/s, but that is usually short-lived unless the source or destination is actually a single drive, in which case you might wind up being limited by the sustained rate of the drive itself (which varies enough for random reasons that you won't know until you actually try it) -- meh.
NFS can be a little annoying to set up if it's on an unfamiliar distro, but generally speaking it has been the fastest way to fill the pipes as fully as possible. The last time I did this over 10gbps I never actually found out if it maxed out the connection, because the transfer was over before I came back from grabbing some coffee -- so there may be some natural limit you hit there. If you have a few network devices between the source and destination you can encounter some slight delays or hiccups from the network slinky effect, but generally this will work across the office (sans other traffic mucking it up) or from one end of the datacenter to the other (unless you have some sort of filtering/inspection occurring internally, in which case all bets are off).
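For reference, a rough sketch of that NFS route on a typical Linux pair (the export path, mount point, and data directory are placeholders; the 192.168.1.100 client address comes from the question; the exports syntax is standard nfs-kernel-server fare, but distros differ):
# on the backup server (NFS server): export a directory to the source machine
echo '/srv/backup 192.168.1.100(rw,sync,no_subtree_check)' >> /etc/exports
exportfs -ra
# on the source machine: mount the export and copy locally
mkdir -p /mnt/backup
mount -t nfs backup_host_or_ip:/srv/backup /mnt/backup
rsync -aP /path/to/data/ /mnt/backup/data/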
EDIT
I noticed some chatter about compression... do not compress the connection. It will slow you down the same way a crypto layer will. The bottleneck will always be a single core if you compress the connection (and you won't even be getting particularly good utilization of that core's bus). The slowest thing you can do in your situation is to use an encrypted, compressed channel between two computers sitting next to each other on a 1gbps or higher connection.
FUTURE PROOFING
This advice stands as of mid-2015. This will almost certainly not be the case for too many more years. So take everything with a grain of salt, and if you face this task regularly then try a variety of methods on actual loads instead of imagining you will get anything close to theoretical optimums, or even observed compression/crypto throughput rates typical for things like web traffic, much of which is textual (protip: bulk transfers usually consist chiefly of images, audio, video, database files, binary code, office file formats, etc. which are already compressed in their own way and benefit very little from being run through yet another compression routine, the compression block size of which is almost guaranteed to not align with your already compressed binary data...).
I imagine that in the future concepts like SCTP will be taken to a more interesting place, where bonded connections (or internally bonded-by-spectrum channelized fiber connections) are typical, and each channel can receive a stream independent of the others, and each stream can be compressed/encrypted in parallel, etc. etc. That would be wonderful! But that's not the case today in 2015, and though fantasizing and theorizing is nice, most of us don't have custom storage clusters running in a cryo-chamber feeding data directly to the innards of a Blue Gene/Q generating answers for Watson. That's just not reality. Nor do we have time to analyze our data payload exhaustively to figure out whether compression is a good idea or not -- the transfer itself would be over before we finished our analysis, regardless how bad the chosen method turned out to be.
But...
Times change and my recommendation against compression and encryption will not stand. I really would love for this advice to be overturned in the typical case very soon. It would make my life easier.
edited Sep 10 '15 at 14:09
answered Sep 7 '15 at 12:50
zxq9
28817
1
@jofel Only when the network speed is slower than the processor's compression throughput -- which is never true for 1Gbps or higher connections. In the typical case, though, the network is the bottleneck, and compression does effectively speed things up -- but this is not the case the OP describes.
– zxq9
Sep 8 '15 at 0:42
2
lz4 is fast enough to not bottleneck gigE, but depending on what you want to do with the copy, you might need it uncompressed. lzop is pretty fast, too. On my i5-2500k Sandybridge (3.8GHz), lz4 < /dev/raid0 | pv -a > /dev/null goes at ~180MB/s input, ~105MB/s output, just right for gigE. Decompressing on the receive side is even easier on the CPU.
– Peter Cordes
Sep 9 '15 at 21:34
@PeterCordes lz4 can indeed fit 1gbps connections roughly the same way other compression routines fit 100Mbps connections -- given the right circumstances, anyway. The basic problem is that compression itself is a bottleneck relative to a direct connection between systems, and usually a bottleneck between two systems on the same local network (typically 1gbps or 10gbps). The underlying lesson is that no setting is perfect for all environments and rules of thumb are not absolute. Fortunately, running a test transfer to figure out what works best in any immediate scenario is trivial.
– zxq9
Sep 10 '15 at 4:39
1
Also, 3.8GHz is a fair bit faster than most server processors run (or many business-grade systems of any flavor, at least that I'm used to seeing). It is more common to see much higher core counts with much lower clock speeds in data centers. Parallelization of transfer loads has not been an issue for a long time, so we are stuck with the max speed of a single core in most cases -- but I expect this will change now that clockspeeds are generally maxed out but network speeds still have a long way to go before hitting their maximums.
– zxq9
Sep 10 '15 at 4:44
2
I completely disagree with your comments regarding compression. It depends completely on the compressibility of the data. If you could get a 99.9% compression ratio, it would be foolish not to do so -- why transfer 100GB when you can get away with transferring 100MB? I'm not suggesting that this level of compression is the case for this question, just showing that this has to be considered on a case-by-case basis and that there are no absolute rules.
– Engineer Dollery
Sep 10 '15 at 10:31
show 10 more comments
up vote
6
down vote
A nifty tool that I've used in the past is bbcp. As seen here: https://www.slac.stanford.edu/~abh/bbcp/
See also http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm
I've had very fast transfer speeds with this tool.
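A hedged sketch of a typical invocation (the host name and paths are placeholders; the flags for parallel streams, TCP window size, and progress interval are worth checking against bbcp --help for your build):
# several parallel TCP streams, enlarged window, progress printed every 2 seconds
bbcp -P 2 -s 8 -w 8M /path/to/bigfile user@targethost:/backup/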
answered Sep 7 '15 at 5:16
DarkHeart
3,38822137
1
The second link of this answer explains how to tune kernel parameters to reach higher speeds. The author there got 800 megabytes per second over a 10G link, and some things seem applicable to 1Gbps links.
– Stéphane Gourichon
Sep 8 '15 at 9:20
up vote
5
down vote
If you get a first pass somehow (over the wire/sneakernet/whatever), you can look into rsync with certain options that can greatly speed up subsequent transfers. A very good way to go would be:
rsync -varzP sourceFiles destination
Options are: verbose, archive mode, recursive, compress, partial progress.
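On a trusted LAN both the ssh transport and -z can become the bottleneck, so a hedged variant is to drop the compression and, where an rsync daemon runs on the target, the ssh layer too (the module name "backup" and the paths are placeholders):
# no compression, still over ssh
rsync -avP sourceFiles/ user@target:/backup/
# no compression and no ssh; requires an rsyncd module named "backup" on the target
rsync -avP sourceFiles/ rsync://target/backup/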
edited Sep 11 '15 at 17:51
A.L
4533617
answered Sep 9 '15 at 6:20
Hopping Bunny
1341
2
Rsync is more reliable than netcat, but archive implies recursive, so the r is redundant.
– Tanath
Sep 9 '15 at 20:04
Also, -z can be incredibly slow depending on your CPU and what data you are processing. I've experienced transfers going from 30 MB/s to 125 MB/s when disabling compression.
– lindhe
Jan 14 '16 at 20:22
up vote
4
down vote
Added at the insistence of the original poster in comments to zackse's answer, although I'm not sure it is the fastest in typical circumstances.
bash has a special redirection syntax:
For output: > /dev/tcp/IP/port
For input: < /dev/tcp/IP/port
IP can be either a dotted-decimal IP or a hostname; port can be either a decimal number or a port name from /etc/services.
There is no actual /dev/tcp/ directory. It's a special syntactic kludge that commands bash to create a TCP socket, connect it to the destination specified, and then do the same thing as a usual file redirection does (namely, replace the respective standard stream with the socket using dup2(2)).
Hence, one can stream data from dd or tar at the source machine directly via TCP. Or, conversely, stream data to tar or something similar directly via TCP. In either case, one superfluous netcat process is eliminated.
Notes about netcat
There is an inconsistency in syntax between classical netcat and GNU netcat. I'll use the classical syntax I'm accustomed to. Replace -lp with -l for GNU netcat. Also, I'm unsure whether GNU netcat accepts the -q switch.
Transferring a disk image
(Along the lines of zackse's answer.)
On destination:
nc -lp 9999 >disk_image
On source:
dd if=/dev/sda >/dev/tcp/destination/9999
Creating a tar.gz archive, with tar
On destination:
nc -lp 9999 >backup.tgz
On source:
tar cz files or directories to be transferred >/dev/tcp/destination/9999
Replace .tgz with .tbz and cz with cj to get a bzip2-compressed archive.
Transferring with immediate expansion to file system
Also with tar.
On destination:
cd backups
tar x </dev/tcp/source/9999
On source:
tar c files or directories to be transferred |nc -q 1 -lp 9999
It will work without -q 1, but netcat will get stuck when the data ends. See tar(1) for an explanation of the syntax and caveats of tar. If there are many files with high redundancy (low entropy), then compression (e.g. cz and xz instead of c and x) can be tried, but if the files are typical and the network is fast enough, it would only slow the process. See mikeserv's answer for details about compression.
Alternative style (the destination listens on the port)
On destination:
cd backups
nc -lp 9999 |tar x
On source:
tar c files or directories to be transferred >/dev/tcp/destination/9999
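If the basic error checking mentioned in the question is wanted, a hedged addition is to hash the stream on both ends while it flows; this assumes bash process substitution and sha256sum, and the two hashes should match once the transfer finishes.
On source:
dd if=/dev/sda | tee >(sha256sum >&2) >/dev/tcp/destination/9999
On destination:
nc -lp 9999 | tee disk_image | sha256sum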
edited Sep 9 '15 at 17:54
A.L
4533617
answered Sep 9 '15 at 10:16
Incnis Mrsi
1,171520
bash can't actually "listen" on a socket apparently, in order to wait and receive a file: unix.stackexchange.com/questions/49936/… so you'd have to use something else for at least one half of the connection...
– rogerdpack
Sep 27 at 22:07
up vote
3
down vote
Try the suggestions regarding direct connections and avoiding encrypted protocols such as ssh. Then if you still want to eke out every bit of performance give this site a read: https://fasterdata.es.net/host-tuning/linux/ for some advice on optimizing your TCP windows.
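As a hedged illustration of the kind of host tuning that page covers (the numbers below are examples for a well-provisioned host, not a recommendation; measure before and after):
# raise the maximum socket buffer sizes and the TCP autotuning limits
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
sysctl -w net.ipv4.tcp_rmem='4096 87380 16777216'
sysctl -w net.ipv4.tcp_wmem='4096 65536 16777216'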
answered Sep 7 '15 at 22:20
Brandon Xavier
25617
up vote
2
down vote
I would use this script I wrote that needs the socat package.
On the source machine:
tarnet -d wherefilesaretosend pass=none 12345 .
On the target machine:
tarnet -d wherefilesaretogo pass=none sourceip/12345
If the vbuf package (Debian, Ubuntu) is there then the file sender will show data transfer progress. The file receiver will show what files are received.
The pass= option can be used where the data might be exposed (slower).
Edit:
Use the -n option to disable compression, if CPU is a bottleneck.
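The tarnet script itself is not reproduced here; as a rough, hedged equivalent using socat and tar directly (arbitrary port, target listening), one could do:
# on the target machine: listen, then unpack whatever arrives
cd /where/files/are/to/go && socat -u TCP-LISTEN:12345,reuseaddr STDOUT | tar x
# on the source machine: pack and send
cd /where/files/are && tar c . | socat -u STDIN TCP:targetip:12345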
edited Sep 7 '15 at 9:18
answered Sep 7 '15 at 9:11
Skaperen
592412
up vote
2
down vote
If budget is not the main concern, you could try connecting the drives with an Intel Xeon E5 12 core "drive connector". This connector is usually so powerful that you can even run your current server software on it. From both servers!
This might look like a fun answer, but you should really consider why you are moving the data between servers and if a big one with shared memory and storage might make more sense.
Not sure about current specs, but the slow transfer might be limited by the disk speeds, not the network?
answered Sep 8 '15 at 0:19
user133111
211
up vote
1
down vote
If you only care about backups, and not about a byte-for-byte copy of the hard drive, then I would recommend BackupPC. http://backuppc.sourceforge.net/faq/BackupPC.html It's a bit of a pain to set up but it transfers very quickly.
My initial transfer time for about 500G of data was around 3 hours. Subsequent backups happen in about 20 seconds.
If you're not interested in backups, but are trying to sync things, then rsync or unison would better fit your needs.
A byte-for-byte copy of a hard disk is usually a horrid idea for backup purposes (no incrementals, no space saving, the drive can't be in use, you have to back up the "empty space", and you have to back up garbage like a 16G swap file or 200G of core dumps or some such). Using rsync (or backuppc or others) you can create "snapshots" in time so you can go to "what your file system looked like 30 mins ago" with very little overhead.
That said, if you really want to transfer a byte-for-byte copy, then your problem is going to lie in the transfer and not in getting data from the drive. Without 400G of RAM, a 320G file transfer is going to take a very long time. Using protocols that are not encrypted is an option, but no matter what, you're just going to have to sit there and wait for several hours (over the network).
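As a hedged sketch of the "snapshots with very little overhead" idea (independent of BackupPC itself; the dates and paths are placeholders), rsync's --link-dest hard-links files that have not changed since the previous snapshot, so each new snapshot only costs the changed data:
rsync -a --link-dest=/backups/2015-09-06 /data/ /backups/2015-09-07/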
1
how does 400G of RAM speed up data transfer?
– Skaperen
Sep 7 '15 at 9:39
Not sure this was the intent, but I read it as "any medium slower than RAM to RAM transfer is going to take a while", rather than "buy 400 GB of RAM and your HDD to HDD transfer will go faster".
– MichaelS
Sep 7 '15 at 10:07
Yep, RAM will buffer for you, and it will seem faster. You can do an HD to HD transfer with RAM buffering all the way and it will seem very fast. It will also take quite a while to flush to disk, but HD to RAM to RAM to HD is faster than HD to HD. (Keep in mind you have to do HD to RAM to RAM to HD anyway, but if you have less RAM than your entire transfer size you will have to "flush" in segments.)
– coteyr
Sep 7 '15 at 12:07
Another way to put it is that to compress or even just send, the entire source drive has to be read into RAM. If it doesn't fit all at once, it has to read a segment, send, discard the segment, seek, read a segment, etc. If it fits all at once then it just has to read it all at one time. Same on the destination.
– coteyr
Sep 7 '15 at 12:10
1
"HD to RAM to RAM to HD is faster than HD to HD" -- how can it be quicker?
– A.L
Sep 11 '15 at 17:36
show 3 more comments
answered Sep 7 '15 at 8:27
coteyr
3,415821
up vote
1
down vote
Regardless of program, I have usually found that "pulling" files over a network is faster than "pushing". That is, logging into the destination computer and doing a read is faster than logging into the source computer and doing a write.
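As a hypothetical illustration of "pulling" versus "pushing" with rsync over ssh (host names, paths and options here are placeholders, not something from the original answer):
# "push": run on the source host, writing to the destination
$ rsync -a --info=progress2 /data/ user@backuphost:/backup/data/
# "pull": run on the destination host, reading from the source
$ rsync -a --info=progress2 user@sourcehost:/data/ /backup/data/
The same distinction applies to cp over a remote mount, as described in the comments below.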
Also, if you are going to use an intermediate drive, consider this: get an external drive (either as a package, or a bare drive plugged into a docking station) which uses eSATA rather than USB. Then on each of the two computers either install a card with an eSATA port, or get a simple adapter cable which brings one of the internal SATA ports out to an external eSATA connector. Then plug the drive into the source computer, power up the drive, and wait for it to auto-mount (you could mount it manually, but if you are doing this repeatedly you might as well put it into your fstab file). Then copy; you will be writing at the same speed as to an internal drive. Then unmount the drive, power it down, plug it into the other computer, power it up, wait for an auto-mount, and read.
2
Can you provide specifics of how you're "pulling" files? What utilities are you using, and can you provide any sample showing this effect?
– STW
Sep 10 '15 at 13:22
I am not sure if this will be a more complete answer, but consider this scenario: Suppose you have two computers, foo and bar, and you want to copy data from foo to bar. (1) You log into foo, then remote mount the drive which is physically attached to bar. Then you copy from foo's disk onto the remotely-mounted directory (which is physically on bar). I called this pushing the data to the other computer. (2) Compare this to the other way of copying the same data. Log into bar, remote-mount the directory attached to foo, and read from foo onto bar's drive. This is pulling.
– Mike Ciaraldi
Sep 10 '15 at 17:42
This copying can be done with the Linux cp command, from a GUI file manager, or any other way of copying files. I think pulling turns out to be faster because writing is slower than reading, and more of the decisions on how to write to the destination disk are being made on the same computer the drive is attached to, so there is less overhead. But maybe this is no longer the case with more modern systems.
– Mike Ciaraldi
Sep 10 '15 at 17:45
add a comment |
answered Sep 7 '15 at 21:58
Mike Ciaraldi
192
up vote
1
down vote
I'm going to recommend that you look at NIC teaming. This involves using multiple network connections running in parallel. Assuming that you really need more than 1Gb of transfer speed, and that 10Gb is cost-prohibitive, the 2Gb provided by NIC teaming would be a minor cost, and your computers may already have the extra ports.
If you're referring to LACP (Link Aggregation Control Protocol) then you're not going to see an increase in speed. It provides redundancy and some ability to serve more concurrent connections, but it won't provide a speed boost for this type of transfer.
– STW
Sep 9 '15 at 21:04
@STW: It requires switch support to aggregate two links to one machine into a 2gbit link, but it is possible. Helpful only if both machines have a 2gbit link to the switch, though. If you have two cables running NIC<->NIC, with no switch, that should work too, but isn't very useful (unless you have a 3rd NIC in one machine to keep them connected to the Internet).
– Peter Cordes
Sep 10 '15 at 6:03
Is there a specific name for this feature in switches?
– STW
Sep 10 '15 at 13:21
There are several variations of NIC teaming, EtherChannel, etc. STW is right that for certain configurations this won't help, but for some configurations it would. It comes down to whether or not the bonded channel speeds up performance for a single IP socket. You'll need to research the specifics to determine if this is a viable solution for you.
– Byron Jones
Sep 11 '15 at 20:18
802.3ad is the open standard that you'd look for on your switches. As a quick hack, though, you might just connect extra NICs to the network, and give them appropriate IP addresses on separate subnets in private address space. (host 1 port a & host 2 port a get one subnet, host 1 port b and host 2 port b get another subnet). Then just run two parallel jobs to do the transfer. This will be a lot simpler than learning the ins and outs of Etherchannel, 802.3ad, etc.
– Dan Pritts
Sep 14 '15 at 14:02
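A rough sketch of Dan Pritts' "quick hack", assuming each host's second NIC has already been given an address on its own private subnet (addresses and paths here are hypothetical):
# run two transfers in parallel, one per link, each covering half the data
$ rsync -a /data/part1/ user@192.168.10.2:/backup/part1/ &
$ rsync -a /data/part2/ user@192.168.20.2:/backup/part2/ &
$ wait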
add a comment |
answered Sep 8 '15 at 19:18
Byron Jones
1192
up vote
1
down vote
FWIW, I've always used this:
tar -cpf - <source path> | ssh user@destserver "cd /; tar xf -"
The nice thing about this method is that it maintains file/folder permissions between machines (assuming the same users/groups exist on both).
(Also, I typically do this to copy virtual disk images, since I can use the -S parameter to handle sparse files.)
Just tested this between two busy servers and managed ~14GB in 216s (about 64MB/s); it might do better between dedicated machines and/or with compression... YMMV
$ date; tar -cpf - Installers | ssh elvis "cd /home/elvis/tst; tar xf -"; date
Wed Sep 9 15:23:37 EDT 2015
Wed Sep 9 15:27:13 EDT 2015
$ du -s Installers
14211072 Installers
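A possible variant combining the -S sparse-file handling and compression mentioned above (untested here; the destination host and path are placeholders):
$ tar -cpSzf - <source path> | ssh user@destserver "cd /; tar xpzf -"
With gzip in the pipeline (-z) the transfer may become CPU-bound on a fast LAN, so it is mainly worth it for compressible data or slower links.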
add a comment |
edited Sep 9 '15 at 21:00
muru
33.9k578147
answered Sep 9 '15 at 20:27
ttstooge
111
up vote
1
down vote
Unless you want to do filesystem forensics, use a dump/restore program for your filesystem to avoid copying the free space that the FS isn't using. Depending on what filesystem you have, this will typically preserve all metadata, including ctime. Inode numbers may change, though, again depending on the filesystem (xfs, ext4, ufs...).
The restore target can be a file on the target system.
If you want a full-disk image with the partition table, you can dd the first 1M of the disk to get the partition table / bootloaders / stuff, but then xfsdump the partitions.
I can't tell from your info-dump what kind of filesystem you actually have. If it's BSD ufs, then I think that has a dump/restore program. If it's ZFS, I don't know; there might be something.
Generally full-copying disks around is too slow for anything except recovery situations. You can't do incremental backups that way, either.
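A rough sketch of what this could look like for an XFS filesystem (device names, mount points and remote paths are assumptions; xfsdump works on a mounted filesystem):
# save the partition table / bootloader area (first 1MiB) to the backup host
$ dd if=/dev/sda bs=1M count=1 | ssh user@backuphost 'cat > sda-first-1M.img'
# level-0 (full) dump of the mounted filesystem, streamed to a file on the backup host
$ xfsdump -l 0 -L data0 -M media0 - /mnt/data | ssh user@backuphost 'cat > data.xfsdump'
# later, restore into a directory on the backup host
$ xfsrestore -f data.xfsdump /srv/restore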
add a comment |
answered Sep 9 '15 at 21:47
Peter Cordes
4,1121032
up vote
1
down vote
You could also set up the systems to have shared storage.
I suggest this considering that they are next to each other, and that you are likely to do this again and again...
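A minimal sketch of one way to do this with NFS (the export path, subnet and mount point are assumptions):
# on the backup server: export a directory (line in /etc/exports), then reload exports
/srv/backups 192.168.1.0/24(rw,async,no_subtree_check)
$ exportfs -ra
# on the source server: mount it and copy as if it were local
$ mount -t nfs backupserver:/srv/backups /mnt/backups
$ rsync -a /data/ /mnt/backups/data/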
add a comment |
answered Sep 10 '15 at 9:29
user133526
111
up vote
1
down vote
How about an ethernet crossover cable? Instead of relying on wireless speeds, you're capped at the wired speed of your NIC.
Here's a similar question with some examples of that kind of solution.
Apparently just a typical ethernet cable will suffice nowadays. Obviously the better your NIC, the faster the transfer.
To summarize, if any network setup is necessary, it should be limited to simply setting static IPs for your server and backup computer with a subnet mask of 255.255.255.0.
Good luck!
Edit:
@Khrystoph touched on this in his answer
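As a hypothetical sketch of that static-IP setup (interface names and addresses are placeholders):
# on the source server, on the directly-cabled NIC
$ ip addr add 10.0.0.1/24 dev eth1
$ ip link set eth1 up
# on the backup server
$ ip addr add 10.0.0.2/24 dev eth1
$ ip link set eth1 up
# then transfer straight over the cable, e.g.
$ rsync -a /data/ user@10.0.0.2:/backup/data/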
How will it improve speed rates? Can you please explain it in your answer?
– A.L
Sep 11 '15 at 17:43
1
It would potentially improve speed because you would not have to worry about the intermediate network slowing you down. Regarding "typical" vs "crossover" ethernet cables - 1Gb ethernet will auto-crossover as necessary. HP ethernet switches will do this at 100Mb. Other brands, generally not, and you'll need a crossover if you're stuck at 100Mb.
– Dan Pritts
Sep 11 '15 at 20:47
add a comment |
edited Sep 11 '15 at 19:56
answered Sep 8 '15 at 7:41
user133156
up vote
1
down vote
Several folks recommend that you skip ssh because encryption will slow you down. Modern CPUs may actually be fast enough at 1Gb, but OpenSSH has problems with its internal windowing implementation that can drastically slow you down.
If you want to do this with ssh, take a look at HPN SSH. It solves the windowing problems and adds multithreaded encryption. Unfortunately you'll need to rebuild ssh on both the client & server.
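If rebuilding ssh on both ends isn't attractive, a smaller tweak that sometimes helps is picking a cheaper cipher for the bulk transfer (a sketch only; the gain depends on your CPU and OpenSSH version, and the host and path are placeholders):
$ tar -cpf - /data | ssh -c aes128-gcm@openssh.com user@backuphost "cd /backup; tar xpf -"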
add a comment |
edited Sep 14 '15 at 13:55
answered Sep 11 '15 at 20:54
Dan Pritts
43538
up vote
0
down vote
OK, I have attempted to answer this question for two computers with "very large pipes" (10GbE) that are "close" to each other.
The problem you run into here is that most compression will bottleneck at the CPU, since the pipes are so large.
Performance transferring a 10GB file (6 Gb network connection [linode], incompressible data):
$ time bbcp 10G root@$dest_ip:/dev/null
0m16.5s
iperf:
server: $ iperf3 -s -F /dev/null
client:
$ time iperf3 -c $dest_ip -F 10G -t 20 # -t needs to be greater than time to transfer complete file
0m13.44s
(30% cpu)
netcat (1.187 openbsd):
server: $ nc -l 1234 > /dev/null
client: $ time nc $dest_ip 1234 -q 0 < 10G
0m13.311s
(58% cpu)
scp:
$ time /usr/local/bin/scp 10G root@$dest_ip:/dev/null
1m31.616s
scp with hpn ssh patch (scp -- hpn patch on client only, so not a good test possibly):
1m32.707s
socat:
server:
$ socat -u TCP-LISTEN:9876,reuseaddr OPEN:/dev/null,creat,trunc
client:
$ time socat -u FILE:10G TCP:$dest_ip:9876
0m15.989s
And two boxes on 10 GbE, slightly older versions of netcat (CentOS 6.7), 10GB file:
nc: 0m18.706s (100% cpu, v1.84, no -q option)
iperf3: 0m10.013s (100% cpu, but it can go up to at least 20GbE at 100% cpu, so not sure it matters)
socat: 0m10.293s (88% cpu, possibly maxed out)
So in one instance netcat used less CPU, in the other socat; YMMV.
With netcat, if it doesn't have a "-N -q 0" option it can transfer truncated files, so be careful... other options like "-w 10" can also result in truncated files.
What is happening in almost all of these cases is that the CPU is being maxed out, not the network. scp maxes out at about 230 MB/s, pegging one core at 100% utilization.
Iperf3 unfortunately creates corrupted files. Some versions of netcat seem not to transfer the entire file, which is very weird, especially older versions.
Various incantations of "gzip as a pipe to netcat" or "mbuffer" also seemed to max out the CPU with the gzip or mbuffer, so they didn't result in a faster transfer with such large pipes. lz4 might help. In addition, some of the gzip pipe variants I attempted resulted in corrupted transfers for very large (> 4 GB) files, so be careful out there :)
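A hypothetical lz4-over-netcat pipeline along those lines (untested here; the port, host and file names are placeholders, and your netcat's EOF-handling options may differ, see the caveat above):
# receiver
$ nc -l 1234 | lz4 -d > /dest/10G
# sender
$ lz4 -c /src/10G | nc -N $dest_ip 1234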
Another thing that might help, especially for higher latency links, is to tune TCP settings. Here is a guide that mentions suggested values:
http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm and https://fasterdata.es.net/host-tuning/linux/ (from another answer)
possibly IRQ settings: https://fasterdata.es.net/host-tuning/100g-tuning/
Suggestions from Linode; add these to /etc/sysctl.conf:
net.core.rmem_max = 268435456
net.core.wmem_max = 268435456
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
net.core.netdev_max_backlog = 250000
net.ipv4.tcp_no_metrics_save = 1
net.core.default_qdisc = fq
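Assuming the values were added to /etc/sysctl.conf as above, they can be loaded without a reboot with:
$ sysctl -p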
Additionally, they would like you to run:
/sbin/ifconfig eth0 txqueuelen 10000
It's worth double-checking after tweaking to make sure the changes don't cause harm.
It may also be worth tuning the window size: https://iperf.fr/iperf-doc.php#tuningtcp
With slow(er) connections compression can definitely help, though. If you have big pipes, very fast compression might help with readily compressible data; I haven't tried it.
The standard answer for "syncing hard drives" is to rsync the files, which avoids transferring data where possible.
add a comment |
edited Oct 2 at 15:55
answered Sep 27 at 22:40
rogerdpack
2711312
52
Don't forget sneakernet
– gwillie
Sep 7 '15 at 4:43
4
Do you want to transfer /dev/sda as an image or just the files? Why is rsync not an option? Is /dev/sda mounted while you dd'ed?
– Jodka Lemon
Sep 7 '15 at 5:48
14
Your performance data (1GB/80sec, 80GB/1h) match perfectly what we should expect on 100MBit. Check your hardware. ... and gerrit is right, 320GB may be large, but "massive amount of data" raises wrong expectations.
– blafasel
Sep 7 '15 at 13:09
7
"Never underestimate the bandwidth of a freight train full of disks." .. Are you asking about throughput, latency, or some mix of the two?
â keshlam
Sep 8 '15 at 21:19
8
A friend of mine always said: "Never underestimate the bandwidth of a pile of hard drives on a truck".
– AMADANON Inc.
Sep 10 '15 at 2:04