What is the fastest way to send massive amounts of data between two computers? [closed]

107 upvotes, 45 favorites

This is a situation I am frequently in:



  • I have a source server with a 320GB hard-drive inside it, and 16GB of RAM (exact specs available here, but as this is an issue I run into frequently on other machines as well, I would prefer the answer to work on any "reasonable" Linux machine)

  • I have a backup server with several terabytes of hard-drive space (exact specs here, see disclaimer above)

I want to transfer 320GB of data from the source server to the target server (specifically, the data from /dev/sda).



  1. The two computers are physically next to each other, so I can run cables between them.

  2. I'm on a LAN, and I'm using a new-ish router, which means my network speeds should "ideally" be 1000Mbit, right?

  3. Security is not an issue. I am on a local network, and I trust all machines on the network, including the router.


  4. (optional) I don't necessarily need a signed checksum of the data, but basic error checking (such as dropped packets, or the drive becoming unreadable) should be detected rather than just disappear into the output.


I searched for this question online, and have tested several commands. The one that appears the most often is this:



ssh user@192.168.1.100 'dd bs=16M if=/dev/sda | gzip' > backup_sda.gz


This command has proven too slow (it ran for an hour and only got about 80GB through the data). It took about 1 minute and 22 seconds for a 1GB test file, and ended up being twice as fast when not compressed. The results may also have been skewed by the fact that the transferred file is smaller than the amount of RAM on the source system.



Moreover (and this was tested on 1GB test pieces), I'm getting issues if I use the gzip command and dd; the resulting file has a different checksum when extracted on the target than it does if piped directly. I'm still trying to figure out why this is happening.
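One way to chase the checksum mismatch is to take the network out of the loop and round-trip a small file through the same dd | gzip pipeline locally; if the sums match here, the corruption is happening in transit or on extraction, not in the pipeline itself. A minimal sketch (the /tmp paths are arbitrary; status=none assumes GNU dd):

```shell
# Round-trip a 1MiB random file through dd | gzip | gunzip and
# compare SHA-256 sums; they should be identical.
dd if=/dev/urandom of=/tmp/testfile bs=1M count=1 status=none
orig=$(sha256sum /tmp/testfile | cut -d' ' -f1)
dd bs=16M if=/tmp/testfile status=none | gzip | gunzip > /tmp/testfile.out
copy=$(sha256sum /tmp/testfile.out | cut -d' ' -f1)
[ "$orig" = "$copy" ] && echo "checksums match"
```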

























closed as primarily opinion-based by slm♦ Sep 28 at 2:28


Many good questions generate some degree of opinion based on expert experience, but answers to this question will tend to be almost entirely based on opinions, rather than facts, references, or specific expertise. If this question can be reworded to fit the rules in the help center, please edit the question.










  • [52] Don't forget sneakernet – gwillie, Sep 7 '15 at 4:43

  • [4] Do you want to transfer /dev/sda as an image or just the files? Why is rsync not an option? Is /dev/sda mounted while you dd it? – Jodka Lemon, Sep 7 '15 at 5:48

  • [14] Your performance data (1GB/80sec, 80GB/1h) match perfectly what we should expect on 100MBit. Check your hardware. ... and gerrit is right, 320GB may be large, but "massive amount of data" raises wrong expectations. – blafasel, Sep 7 '15 at 13:09

  • [7] "Never underestimate the bandwidth of a freight train full of disks." .. Are you asking about throughput, latency, or some mix of the two? – keshlam, Sep 8 '15 at 21:19

  • [8] A friend of mine always said: "Never underestimate the bandwidth of a pile of hard drives on a truck". – AMADANON Inc., Sep 10 '15 at 2:04






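blafasel's arithmetic above is easy to verify: 1 GiB in 82 seconds works out to roughly the ceiling of a 100 Mbit link, which suggests a port negotiating at Fast Ethernet rather than gigabit. A one-liner check (the 82 s figure is from the question):

```shell
# 1 GiB transferred in 82 s, expressed in megabits per second:
awk 'BEGIN { printf "%.0f Mbit/s\n", (1024 * 8) / 82 }'
```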








networking backup dd file-transfer






edited Sep 7 '15 at 10:46 by Community♦
asked Sep 7 '15 at 3:44 by IQAndreas
20 Answers

Answer (136 votes)

Since the servers are physically next to each other, and you mentioned in the comments you have physical access to them, the fastest way would be to take the hard-drive out of the first computer, place it into the second, and transfer the files over the SATA connection.
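Once the drive is in the second machine, the copy is a single local command. A sketch, assuming the relocated disk shows up as /dev/sdb (verify with lsblk before touching real devices); the same dd invocation is demonstrated below on a throwaway file:

```shell
# Real-world form (device and path are placeholders, do not run blindly):
#   dd if=/dev/sdb of=/backups/source_sda.img bs=16M conv=noerror,sync status=progress
# Demonstration of the same copy on a small file:
dd if=/dev/urandom of=/tmp/src.img bs=1M count=4 status=none
dd if=/tmp/src.img of=/tmp/dst.img bs=16M status=none
cmp -s /tmp/src.img /tmp/dst.img && echo "images identical"
```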






















  • [15] +1: Transferring via physical means seems to be the fastest route, even if it means getting a big external hard drive from somewhere. It's about £40, and you've probably spent that much in time already. – deworde, Sep 7 '15 at 10:06

  • [3] I completely disagree with this idea if one is getting full speed across a gigabit network. Testing over NFS/SMB over a Zyxel Gigabit switch between an HP Gen 7 microserver and a Pentium G630 machine gives me ~100MB/s transfer. (Until I leave the outer edge of the drive platters.) So I think it'd realistically be done in under 3 hours. Unless you are using SSDs or extremely high performance drives/storage, I don't think 2 copies can produce a throughput of 100MB/s; that would require each copy operation to be 200MB/s just to break even. – Phizes, Sep 8 '15 at 1:52

  • [3] @Phizes: obviously you don't copy to a temporary. That was deworde's bad idea, not what everyone else is talking about. The point of connecting the source drive to the target machine is to go SATA->SATA with dd (or a filesystem tree copy). – Peter Cordes, Sep 9 '15 at 21:12

  • [10] "Never underestimate the bandwidth of a truck full of hard drives. One hell of a latency though" – Kevin, Sep 11 '15 at 4:31

  • [3] @Kevin: yes, my point was that a direct copy between disks in the same computer is at least as fast as any other possible method. I brought up real-life bandwidth numbers to acknowledge Phizes's point that going over gigE is fine for the OP's old drive, but a bottleneck for new drives. (One case where both drives in one computer is not the best option is when having separate computers using their RAM to cache the metadata of the source and dest is important, e.g. for rsync of billions of files.) – Peter Cordes, Sep 13 '15 at 3:59

















Answer (69 votes)

netcat is great for situations like this where security is not an issue:



# on destination machine, create listener on port 9999
nc -l 9999 > /path/to/outfile

# on source machine, send to destination:9999
nc destination_host_or_ip 9999 < /dev/sda
# or dd if=/dev/sda | nc destination_host_or_ip 9999


Note, if you are using dd from GNU coreutils, you can send SIGUSR1 to the process and it will emit progress to stderr. For BSD dd, use SIGINFO.



pv is even more helpful in reporting progress during the copy:



# on destination
nc -l 9999 | pv > /path/to/outfile

# on source
pv /dev/sda | nc destination_host_or_ip 9999
# or dd if=/dev/sda | pv | nc destination_host_or_ip 9999
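A variant worth noting (my addition, not from the answer above): tee can fork the incoming stream into the output file and a checksum in one pass, so the copy can be verified against the source afterwards. On the destination that would look like nc -l 9999 | tee /path/to/outfile | sha256sum; the tee trick itself, demonstrated without the network leg:

```shell
# Write a stream to a file and record its checksum in one pass with tee.
printf 'example payload' | tee /tmp/outfile | sha256sum | cut -d' ' -f1 > /tmp/outfile.sha256
# Re-hashing the written file reproduces the recorded sum.
sha256sum /tmp/outfile | cut -d' ' -f1 | cmp -s - /tmp/outfile.sha256 && echo "verified"
```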























  • [2] For the second example, is dd even required, or can pv/nc treat /dev/sda just fine on their own? (I have noticed some commands "throw up" when trying to read special files like that one, or files with 0x00 bytes) – IQAndreas, Sep 7 '15 at 4:07

  • [5] @user1794469 Will the compression help? I'm thinking the network is not where the bottleneck is. – IQAndreas, Sep 7 '15 at 6:40

  • [16] Don't forget that in bash one can use > /dev/tcp/IP/port and < /dev/tcp/IP/port redirections instead of piping to and from netcat respectively. – Incnis Mrsi, Sep 7 '15 at 16:41

  • [5] Good answer. Gigabit Ethernet is often faster than hard drive speed, so compression is useless. To transfer several files consider tar cv sourcedir | pv | nc dest_host_or_ip 9999 and cd destdir ; nc -l 9999 | pv | tar xv. Many variations are possible; you might e.g. want to keep a .tar.gz at the destination side rather than copies. If you copy directory to directory, for extra safety you can perform an rsync afterwards, e.g. from dest: rsync --inplace -avP user@192.168.1.100:/path/to/source/. /path/to/destination/. This will guarantee that all files are indeed exact copies. – Stéphane Gourichon, Sep 8 '15 at 9:08

  • [3] Instead of using IPv4 you can achieve a better throughput by using IPv6 because it has a bigger payload. You don't even need to configure it; if the machines are IPv6 capable they probably already have an IPv6 link-local address. – David Costa, Sep 8 '15 at 13:07
    Sep 8 '15 at 13:07

















Answer (33 votes)


  1. Do use fast compression.



    • Whatever your transfer medium - especially for network or usb - you'll be working with data bursts for reads, caches, and writes, and these will not exactly be in sync.

    • Besides the disk firmware, disk caches, and kernel/ram caches, if you can also employ the systems' CPUs in some way to concentrate the amount of data exchanged per burst then you should do so.

    • Any compression algorithm at all will automatically handle sparse runs of input as fast as possible, but there are very few that will handle the rest at network throughputs.


    • lz4 is your best option here:


      LZ4 is a very fast lossless compression algorithm, providing compression speed at 400 MB/s per core, scalable with multi-cores CPU. It also features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.






  2. Preferably do not unnecessarily seek.



    • This can be difficult to gauge.


    • If there is a lot of free space on the device from which you copy, and the device has not been recently zeroed, but all of the source file-system(s) should be copied, then it is probably worth your while to first do something like:



      </dev/zero tee >empty empty1 empty2; sync; rm empty*



    • But that depends on what level you should be reading the source. It is usually desirable to read the device from start to finish from its /dev/some_disk device file, because reading at the file-system level will generally involve seeking back-and-forth and around the disk non-sequentially. And so your read command should be something like:



      </dev/source_device lz4 | ...



    • However, if your source file-system should not be transferred entire, then reading at the file-system level is fairly unavoidable, and so you should ball up your input contents into a stream. pax is generally the best and most simple solution in that case, but you might also consider mksquashfs as well.



      pax -r /source/tree[12] | lz4 | ...
      mksquashfs /source/tree[12] /dev/fd/1 -comp lz4 | ...




  3. Do not encrypt with ssh.



    • Adding encryption overhead to a trusted medium is unnecessary, and can be severely detrimental to the speed of sustained transfers in that the data read needs reading twice.

    • The PRNG needs the read data, or at least some of it, to sustain randomness.

    • And of course you need to transfer the data as well.

    • You also need to transfer the encryption overhead itself - which means more work for less data transferred per burst.


    • And so rather you should use netcat (or, as I prefer, the nmap project's more capable ncat) for a simple network copy, as has elsewhere been suggested:



      ### on tgt machine...
      nc -l 9999 > out.lz4
      ### then on src machine...
      ... lz4 | nc tgt.local 9999
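Before wiring lz4 into nc, it is cheap to sanity-check the compression leg on its own (this assumes the lz4 CLI is installed; file names are arbitrary):

```shell
# Round-trip a small file through lz4 and confirm it survives intact.
dd if=/dev/urandom of=/tmp/payload bs=1M count=2 status=none
lz4 -q < /tmp/payload | lz4 -dq > /tmp/payload.out
cmp -s /tmp/payload /tmp/payload.out && echo "lz4 round trip ok"
```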

























  • [1] Fantastic answer. One minor grammatical point -- "lessen the amount of data which needs exchanging per burst" -- I think you're using compression to increase the information density, as the 'bursts' are fixed-width and therefore the amount of exchanged data remains constant, though the information transferred per burst may vary. – Engineer Dollery, Sep 9 '15 at 8:39

  • @EngineerDollery - yes, that was dumb. I think it is better now. – mikeserv, Sep 10 '15 at 8:17

  • @IQAndreas - I would seriously consider this answer. Personally I use pigz, and the speed increase is amazing. The parallelism is a huge win; CPUs are much faster than any other part of the data pipeline, so I doubt parallel compression will slow you down (gzip is not parallelizable). You may find this fast enough that there's no incentive to juggle hard drives; I wouldn't be surprised if this one is overall faster (including disk swap time). You could benchmark with and without compression. In any case, either BlueRaja's diskswap answer or this one should be your accepted answer. – Mike S, Sep 10 '15 at 13:38

  • Fast compression is excellent advice. It should be noted, though, that it helps only if the data is reasonably compressible, which means, e.g., that it must not already be in a compressed format. – Walter Tross, Sep 10 '15 at 21:50

  • @WalterTross - it will help if any input is compressible, no matter the ratio, so long as the compression job outperforms the transfer job. On a modern four-core system an lz4 job should easily pace even wide-open GIGe, and USB 2.0 doesn't stand a chance. Besides, lz4 was designed only to work when it should - it is partly so fast because it knows when compression should be attempted and when it shouldn't. And if it is a device-file being transferred, then even precompressed input may compress somewhat anyway if there is any fragmentation in the source filesystem. – mikeserv, Sep 10 '15 at 22:13

















Answer (25 votes)


Several factors could be limiting the transfer speed.



  1. There is inherent network overhead on a 1Gbps pipe. Usually, this reduces ACTUAL throughput to 900Mbps or less. Then you have to remember that this is bidirectional traffic and you should expect significantly less than 900Mbps down.


  2. Even though you're using a "new-ish router", are you certain that the router supports 1Gbps? Not all new routers do. Also, unless it is an enterprise-grade router, you are likely to lose additional transmit bandwidth to router inefficiency. Though based on what I found below, it looks like you're getting above 100Mbps.


  3. There could be network congestion from other devices sharing your network. Have you tried using a directly attached cable as you said you were able to do?


  4. What amount of your disk IO are you using? Likely, you're being limited, not by the network, but by the disk drive. Most 7200rpm HDDs will only get around 40MB/s. Are you using raid at all? Are you using SSDs? What are you using on the remote end?


I suggest using rsync if this is expected to be re-run for backups. You could also use scp, FTP(S), or HTTP with a download manager like FileZilla on the other end, since those clients can parallelize connections. Parallelism can increase the total bandwidth, because the other solutions run over a single pipe; a single pipe/thread is limited by being single-threaded, which means it could even be CPU bound.

With rsync you take out a large amount of the complexity of your solution, and you get compression, permission preservation, and partial transfers. There are several other reasons, but it is generally the preferred backup method of large enterprises (or what runs underneath their backup systems). Commvault actually uses rsync underneath their software as the delivery mechanism for backups.



Based on your given example of 80GB/h, you're getting around 177Mbps (22.2MB/s). I feel you could easily double this with rsync on a dedicated ethernet line between the two boxes as I've managed to get this in my own tests with rsync over gigabit.
























  • [11] +1 for rsync. It might not be faster the first time you run it, but it certainly will be for all the subsequent times. – Skrrp, Sep 7 '15 at 8:23

  • [4] > Most 7200rpm HDDs will only get around 40MB/s. IME you're more likely to see over 100MB/s sequential with a modern drive (and this includes ~5k drives). Though, this might be an older disk. – Bob, Sep 7 '15 at 15:38

  • [2] @Bob: Those modern drives can still read only 5400 circular tracks per minute. These disks are still fast because each track contains more than a megabyte. That does mean they're also quite big disks; a small 320 GB disk can't hold too many kilobytes per track, which necessarily limits their speed. – MSalters, Sep 7 '15 at 17:07

  • [1] 40MB/s is definitely very pessimistic for sequential read for any drive made in the last decade. Current 7200RPM drives can exceed 100MB/s as Bob says. – hobbs, Sep 10 '15 at 18:00

  • [3] Gigabit Ethernet is 1000Mbps full duplex. You get 1000Mbps (or, as you say, around 900Mbps in reality) each direction. Second... hard drives now routinely get 100MB/sec. 40MB/sec is slow, unless this is a decade-old drive. – derobert, Sep 10 '15 at 20:54


















Answer (16 votes)


We deal with this regularly.



The two main methods we tend to use are:



  1. SATA/eSATA/sneakernet

  2. Direct NFS mount, then local cp or rsync

The first is dependent on whether the drive can be physically relocated. This is not always the case.



The second works surprisingly well. Generally we max out a 1gbps connection rather easily with direct NFS mounts. You won't get anywhere close to this with scp, dd over ssh, or anything similar (you'll often get a max rate suspiciously close to 100Mbps). Even on very fast multicore processors you will hit a bottleneck on the max crypto throughput of one of the cores on the slowest of the two machines, which is depressingly slow compared to full-bore cp or rsync on an unencrypted network mount. Occasionally you'll hit an iops wall for a little while and be stuck at around ~53MB/s instead of the more typical ~110MB/s, but that is usually short lived unless the source or destination is actually a single drive, in which case you might wind up being limited by the sustained rate of the drive itself (which varies enough, for random reasons, that you won't know until you actually try it) -- meh.



NFS can be a little annoying to set up if it's on an unfamiliar distro, but generally speaking it has been the fastest way to fill the pipes as fully as possible. The last time I did this over 10gbps I never actually found out if it maxed out the connection, because the transfer was over before I came back from grabbing some coffee -- so there may be some natural limit you hit there. If you have a few network devices between the source and destination you can encounter some slight delays or hiccups from the network slinky effect, but generally this will work across the office (sans other traffic mucking it up) or from one end of the datacenter to the other (unless you have some sort of filtering/inspection occurring internally, in which case all bets are off).



EDIT



I noticed some chatter about compression... do not compress the connection. It will slow you down the same way a crypto layer will. The bottleneck will always be a single core if you compress the connection (and you won't even be getting particularly good utilization of that core's bus). The slowest thing you can do in your situation is to use an encrypted, compressed channel between two computers sitting next to each other on a 1gbps or higher connection.



FUTURE PROOFING



This advice stands as of mid-2015. This will almost certainly not be the case for too many more years. So take everything with a grain of salt, and if you face this task regularly then try a variety of methods on actual loads instead of imagining you will get anything close to theoretical optimums, or even observed compression/crypto throughput rates typical for things like web traffic, much of which is textual (protip: bulk transfers usually consist chiefly of images, audio, video, database files, binary code, office file formats, etc. which are already compressed in their own way and benefit very little from being run through yet another compression routine, the compression block size of which is almost guaranteed to not align with your already compressed binary data...).



I imagine that in the future concepts like SCTP will be taken to a more interesting place, where bonded connections (or internally bonded-by-spectrum channelized fiber connections) are typical, and each channel can receive a stream independent of the others, and each stream can be compressed/encrypted in parallel, etc. etc. That would be wonderful! But that's not the case today in 2015, and though fantasizing and theorizing is nice, most of us don't have custom storage clusters running in a cryo-chamber feeding data directly to the innards of a Blue Gene/Q generating answers for Watson. That's just not reality. Nor do we have time to analyze our data payload exhaustively to figure out whether compression is a good idea or not -- the transfer itself would be over before we finished our analysis, regardless how bad the chosen method turned out to be.



But...



Times change and my recommendation against compression and encryption will not stand. I really would love for this advice to be overturned in the typical case very soon. It would make my life easier.
























  • 1




@jofel Only when the network speed is slower than the processor's compression throughput -- which is never true for 1Gbps or higher connections. In the typical case, though, the network is the bottleneck, and compression does effectively speed things up -- but this is not the case the OP describes.
    – zxq9
    Sep 8 '15 at 0:42







  • 2




    lz4 is fast enough to not bottleneck gigE, but depending on what you want to do with the copy, you might need it uncompressed. lzop is pretty fast, too. On my i5-2500k Sandybridge (3.8GHz), lz4 < /dev/raid0 | pv -a > /dev/null goes at ~180MB/s input, ~105MB/s output, just right for gigE. Decompressing on the receive side is even easier on the CPU.
    – Peter Cordes
    Sep 9 '15 at 21:34











  • @PeterCordes lz4 can indeed fit 1Gbps connections roughly the same way other compression routines fit 100Mbps connections -- given the right circumstances, anyway. The basic problem is that compression itself is a bottleneck relative to a direct connection between systems, and usually a bottleneck between two systems on the same local network (typically 1Gbps or 10Gbps). The underlying lesson is that no setting is perfect for all environments and rules of thumb are not absolute. Fortunately, running a test transfer to figure out what works best in any immediate scenario is trivial.
    – zxq9
    Sep 10 '15 at 4:39






  • 1




    Also, 3.8GHz is a fair bit faster than most server processors run (or many business-grade systems of any flavor, at least that I'm used to seeing). It is more common to see much higher core counts with much lower clock speeds in data centers. Parallelization of transfer loads has not been an issue for a long time, so we are stuck with the max speed of a single core in most cases -- but I expect this will change now that clockspeeds are generally maxed out but network speeds still have a long way to go before hitting their maximums.
    – zxq9
    Sep 10 '15 at 4:44






  • 2




    I completely disagree with your comments regarding compression. It depends completely on the compressability of the data. If you could get a 99.9% compression ratio, it would be foolish not to do so -- why transfer 100GB when you can get away with transferring 100MB? I'm not suggesting that this level of compression is the case for this question, just showing that this has to be considered on a case by case basis and that there are no absolute rules.
    – Engineer Dollery
    Sep 10 '15 at 10:31


















up vote
6
down vote













A nifty tool that I've used in the past is bbcp. As seen here: https://www.slac.stanford.edu/~abh/bbcp/ .



See also http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm



I've had very fast transfer speeds with this tool.






















  • 1




The second link of this answer explains how to tune kernel parameters to reach higher speeds. The author there got 800 megabytes per second on a 10G link, and some things seem applicable to 1Gbps links.
    – Stéphane Gourichon
    Sep 8 '15 at 9:20

















up vote
5
down vote













If you get a first pass somehow (over the wire/sneakernet/whatever), you can look into rsync with certain options that can greatly speed up subsequent transfers. A very good way to go would be:



rsync -varzP sourceFiles destination


Options are: verbose, archive mode, compress, and -P for --partial plus --progress (archive mode already implies recursive).
























  • 2




    Rsync is more reliable than netcat, but archive implies recursive, so the r is redundant.
    – Tanath
    Sep 9 '15 at 20:04










  • Also, -z can be incredibly slow depending on your CPU and what data you are processing. I've experienced transfers going from 30 MB/s to 125 MB/s when disabling compression.
    – lindhe
    Jan 14 '16 at 20:22

















up vote
4
down vote













Added at the insistence of the original poster in comments on zackse’s answer, although I’m not sure it is the fastest in typical circumstances.



bash has a special redirection syntax:

For output:     > /dev/tcp/IP/port

For input:      < /dev/tcp/IP/port

IP can be either a dotted-decimal IP or a hostname;
port can be either a decimal number or a port name from /etc/services.



There is no actual /dev/tcp/ directory. It’s a special syntactic kludge that commands bash to create a TCP socket, connect it to the destination specified, and then do the same thing as a usual file redirection does (namely, replace the respective standard stream with the socket using dup2(2)).



Hence, one can stream data from dd or tar on the source machine directly via TCP. Or, conversely, stream data to tar or something similar directly via TCP. In either case, one superfluous netcat is eliminated.



Notes about netcat



There is an inconsistency in syntax between classical netcat and GNU netcat. I’ll use the classical syntax I’m accustomed to. Replace -lp with -l for GNU netcat.



Also, I’m unsure whether GNU netcat accepts the -q switch.



Transferring a disk image



(Along the lines of zackse’s answer.)

On destination:



nc -lp 9999 >disk_image


On source:



dd if=/dev/sda >/dev/tcp/destination/9999
 


Creating a tar.gz archive, with tar



On destination:



nc -lp 9999 >backup.tgz


On source:



tar cz files or directories to be transferred >/dev/tcp/destination/9999


Replace .tgz with .tbz and cz with cj to get a bzip2-compressed archive.



Transferring with immediate expansion to file system



Also with tar.

On destination:



cd backups
tar x </dev/tcp/source/9999


On source:



tar c files or directories to be transferred |nc -q 1 -lp 9999


It will work without -q 1, but netcat will get stuck when the data ends. See tar(1) for explanation of the syntax and caveats of tar. If there are many files with high redundancy (low entropy), then compression (e.g. cz and xz instead of c and x) can be tried, but if files are typical and the network is fast enough, it would only slow the process. See mikeserv’s answer for details about compression.



Alternative style (the destination listens on the port)



On destination:



cd backups
nc -lp 9999 |tar x


On source:



tar c files or directories to be transferred >/dev/tcp/destination/9999



























  • bash can't actually "listen" on a socket apparently, in order to wait and receive a file: unix.stackexchange.com/questions/49936/… so you'd have to use something else for at least one half of the connection...
    – rogerdpack
    Sep 27 at 22:07

















up vote
3
down vote













Try the suggestions regarding direct connections and avoiding encrypted protocols such as ssh. Then if you still want to eke out every bit of performance give this site a read: https://fasterdata.es.net/host-tuning/linux/ for some advice on optimizing your TCP windows.

































    up vote
    2
    down vote













    I would use this script I wrote that needs the socat package.



    On the source machine:



    tarnet -d wherefilesaretosend pass=none 12345 .


    On the target machine:



    tarnet -d wherefilesaretogo pass=none sourceip/12345


    If the vbuf package (Debian, Ubuntu) is installed, the file sender will show data progress, and the file receiver will show which files are received.
    The pass= option can be used where the data might be exposed (slower).



    Edit:



    Use the -n option to disable compression, if CPU is a bottleneck.



































      up vote
      2
      down vote













      If budget is not the main concern, you could try connecting the drives with an Intel Xeon E5 12-core "drive connector". This connector is usually so powerful that you can even run your current server software on it. From both servers!



      This might look like a fun answer, but you should really consider why you are moving the data between servers and if a big one with shared memory and storage might make more sense.



      Not sure about current specs, but the slow transfer might be limited by the disk speeds, not the network?

































        up vote
        1
        down vote













        If you only care about backups, and not about a byte-for-byte copy of the hard drive, then I would recommend BackupPC. http://backuppc.sourceforge.net/faq/BackupPC.html It's a bit of a pain to set up but it transfers very quickly.



        My initial transfer time for about 500G of data was around 3 hours. Subsequent backups happen in about 20 seconds.



        If you're not interested in backups, but are trying to sync things, then rsync or unison would better fit your needs.



        A byte-for-byte copy of a hard disk is usually a horrid idea for backup purposes (no incrementals, no space saving, the drive can't be in use, you have to back up the "empty space", and you have to back up garbage like a 16G swap file or 200G of core dumps). Using rsync (or BackupPC or others) you can create "snapshots" in time, so you can go to "what your file system looked like 30 mins ago" with very little overhead.



        That said, if you really want to transfer a byte-for-byte copy, then your problem is going to lie in the transfer and not in getting data off the drive. Without 400G of RAM, a 320G file transfer is going to take a very long time. Using unencrypted protocols is an option, but no matter what, you're just going to have to sit there and wait for several hours (over the network).






















        • 1




          how does 400G of RAM speed up data transfer?
          – Skaperen
          Sep 7 '15 at 9:39










        • Not sure this was the intent, but I read it as "any medium slower than RAM to RAM transfer is going to take a while", rather than "buy 400 GB of RAM and your HDD to HDD transfer will go faster".
          – MichaelS
          Sep 7 '15 at 10:07










        • Yep, RAM will buffer for you, and it will seem faster. You can do a HD to HD transfer with RAM buffering all the way and it will seem very fast. It will also take quite a while to flush to disk, but HD to RAM to RAM to HD is faster than HD to HD. (Keep in mind you have to do HD to RAM to RAM to HD anyway, but if you have less than your entire transfer size of RAM you will have to "flush" in segments.)
          – coteyr
          Sep 7 '15 at 12:07










        • Another way to put is that to compress or even just send the entire source drive has to be read in to ram. If it doesn't fit all at once, it has to read a segment, send, discard segment, seek, read segment, etc. If it fits all at once then it just has to read all at one time. Same on the destination.
          – coteyr
          Sep 7 '15 at 12:10







        • 1




          HD to RAM to RAM to HD is faster then HD to HD How can it be quicker?
          – A.L
          Sep 11 '15 at 17:36

















        up vote
        1
        down vote













        Regardless of program, I have usually found that "pulling" files over a network is faster than "pushing". That is, logging into the destination computer and doing a read is faster than logging into the source computer and doing a write.



        Also, if you are going to use an intermediate drive, consider this: Get an external drive (either as a package, or a separate drive plugged into a docking station) which uses eSATA rather than USB. Then on each of the two computers either install a card with an eSATA port, or get a simple adapter cable which brings one of the internal SATA ports to an external eSATA connector. Then plug the drive into the source computer, power up the drive, and wait for it to auto-mount (you could mount manually, but if you are doing this repeatedly you might as well put it into your fstab file). Then copy; you will be writing at the same speed as to an internal drive. Then unmount the drive, power down, plug into the other computer, power up, wait for an auto-mount, and read.
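        The fstab entry mentioned above could look something like this (the UUID and mount point are placeholders for your actual drive):

```shell
# /etc/fstab -- transfer drive, mounted on demand by a regular user
UUID=0000-EXAMPLE  /mnt/shuttle  ext4  noauto,user  0  0
```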






















        • 2




          Can you provide specifics of how you're "pulling" files? What utilities are you using, and can you provide any sample showing this effect?
          – STW
          Sep 10 '15 at 13:22










        • I am not sure if this will be a more complete answer, but consider this scenario: Suppose you have two computers, foo and bar, and you want to copy data from foo to bar. (1) You log into foo, then remote mount the drive which is physically attached to bar. Then you copy from foo's disk onto the remotely-mounted directory (which is physically on bar). I called this pushing the data to the other computer. (2) Compare this to the other way of copying the same data. Log into bar, remote-mount the directory attached to foo, and read from foo onto bar's drive. This is pulling.
          – Mike Ciaraldi
          Sep 10 '15 at 17:42










        • This copying can be done with the Linux cp command, from a GUI file manager, or any other way of copying files. I think pulling turns out to be faster because writing is slower than reading, and more of the decisions on how to write to the destination disk are made on the same computer the drive is attached to, so there is less overhead. But maybe this is no longer the case with more modern systems.
          – Mike Ciaraldi
          Sep 10 '15 at 17:45

















        up vote
        1
        down vote













        I'm going to recommend that you look at NIC teaming. This involves using multiple network connections running in parallel. Assuming that you really need more than 1Gb/s of transfer, and that 10Gb/s is cost-prohibitive, the 2Gb/s provided by NIC teaming would be a minor cost, and your computers may already have the extra ports.


























        • If you're referring to LACP (Link Aggregation Control Protocol) then you're not going to see an increase in speed. It Provided redundancy and some ability to serve more concurrent connections, but it won't provide a speed boost for this type of transfer.
          – STW
          Sep 9 '15 at 21:04










        • @STW: It requires switch support to aggregate two links to one machine into a 2gbit link, but it is possible. Helpful only if both machines have a 2gbit link to the switch, though. If you have two cables running NIC<->NIC, with no switch, that should work too, but isn't very useful (unless you have a 3rd NIC in one machine to keep them connected to the Internet).
          – Peter Cordes
          Sep 10 '15 at 6:03











        • is there a specific name for this feature in switches?
          – STW
          Sep 10 '15 at 13:21










        • There are several variations of NIC-teaming, EtherChannel, etc. STW is right for certain configurations, this won't help, but for some configurations, it would. It comes down to whether or not the bonded channel speeds up performance for a single IP socket or not. You'll need to research the specifics to determine if this is a viable solution for you.
          – Byron Jones
          Sep 11 '15 at 20:18










        • 802.3ad is the open standard that you'd look for on your switches. As a quick hack, though, you might just connect extra NICs to the network, and give them appropriate IP addresses on separate subnets in private address space. (host 1 port a & host 2 port a get one subnet, host 1 port b and host 2 port b get another subnet). Then just run two parallel jobs to do the transfer. This will be a lot simpler than learning the ins and outs of Etherchannel, 802.3ad, etc.
          – Dan Pritts
          Sep 14 '15 at 14:02

















        up vote
        1
        down vote













        FWIW, I've always used this:



        tar -cpf - <source path> | ssh user@destserver "cd /; tar xf -"


        The nice thing about this method is that it maintains file/folder permissions between machines (assuming the same users/groups exist on both)
        (Also I typically do this to copy virtual disk images since I can use a -S parameter to handle sparse files.)
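        The sparse-file variant mentioned above just adds -S on the creating side; a sketch, with hypothetical paths:

```shell
# -S makes tar detect holes, so a mostly-empty disk image
# doesn't expand to its full nominal size on the wire.
tar -cpSf - /var/lib/libvirt/images | ssh user@destserver "cd /; tar xpf -"
```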



        Just tested this between two busy servers and managed ~14GB in 216s
        (about 64MB/s); it might do better between dedicated machines and/or with compression... YMMV



        $ date; tar -cpf - Installers | ssh elvis "cd /home/elvis/tst; tar xf -"; date
        Wed Sep 9 15:23:37 EDT 2015
        Wed Sep 9 15:27:13 EDT 2015

        $ du -s Installers
        14211072 Installers


































          up vote
          1
          down vote













          Unless you want to do filesystem forensics, use a dump/restore program for your filesystem to avoid copying the free space that the FS isn't using. Depending on what filesystem you have, this will typically preserve all metadata, including ctime. Inode numbers may change, though, again depending on the filesystem (xfs, ext4, ufs...).



          The restore target can be a file on the target system.



          If you want a full-disk image with the partition table, you can dd the first 1M of the disk to get the partition table / bootloaders / stuff, but then xfsdump the partitions.
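          Grabbing that first mebibyte separately is a one-liner with dd (device and output names here are hypothetical; adjust for your disk):

```shell
# Save the partition table / bootloader region before dumping the partitions.
dd if=/dev/sda of=first-1M.bin bs=1M count=1
```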



          I can't tell from your info-dump what kind of filesystem you actually have. If it's BSD ufs, then I think that has a dump/restore program. If it's ZFS, well IDK, there might be something.



          Generally full-copying disks around is too slow for anything except recovery situations. You can't do incremental backups that way, either.

































            up vote
            1
            down vote













            You could also set up the systems to have shared storage!



            I am considering that these are next to each other, and you are likely to do this again & again ....

































              up vote
              1
              down vote













              How about an ethernet crossover cable? Instead of relying on wireless speeds you're capped at the wired speed of your NIC.



              Here's a similar question with some examples of that kind of solution.



              Apparently just a typical ethernet cable will suffice nowadays. Obviously the better your NIC the faster the transfer.



              To summarize, if any network setup is necessary, it should be limited to simply setting static IPs for your server and backup computer with a subnet mask 255.255.255.0
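              Setting those static addresses is quick with iproute2 (the interface name and subnet are examples; check yours with `ip link`):

```shell
# On the server:
ip addr add 192.168.1.10/24 dev eth0
# On the backup machine:
ip addr add 192.168.1.11/24 dev eth0
# Verify the direct link from the backup machine:
ping -c 3 192.168.1.10
```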



              Good luck!



              Edit:



              @Khrystoph touched on this in his answer




























              • How will it improve speed rates? Can you please explain it your answer?
                – A.L
                Sep 11 '15 at 17:43







              • 1




                It would potentially improve speed because you would not have to worry about the intermediate network slowing you down. Regarding "typical" vs "crossover" ethernet cables - 1Gb ethernet will auto-crossover as necessary. HP ethernet switches will do this at 100Mb. Other brands, generally not, and you'll need a crossover if you're stuck at 100Mb.
                – Dan Pritts
                Sep 11 '15 at 20:47

















              up vote
              1
              down vote













              Several folks recommend that you skip ssh because encryption will slow you down. Modern CPUs may actually be fast enough at 1Gb, but OpenSSH has problems with its internal windowing implementation that can drastically slow you down.



              If you want to do this with ssh, take a look at HPN SSH. It solves the windowing problems and adds multithreaded encryption. Unfortunately you'll need to rebuild ssh on both the client & server.



































                up vote
                0
                down vote













                OK I have attempted to answer this question for two computers with "very large pipes" (10Gbe) that are "close" to each other.



                The problem you run into here is: most compression will bottleneck at the cpu, since the pipes are so large.



                performance to transfer 10GB file (6 Gb network connection [linode], uncompressible data):



                $ time bbcp 10G root@$dest_ip:/dev/null
                0m16.5s

                iperf:

                server: $ iperf3 -s -F /dev/null
                client:
                $ time iperf3 -c $dest_ip -F 10G -t 20 # -t needs to be greater than time to transfer complete file
                0m13.44s
                (30% cpu)

                netcat (1.187 openbsd):

                server: $ nc -l 1234 > /dev/null
                client: $ time nc $dest_ip 1234 -q 0 < 10G
                0m13.311s
                (58% cpu)

                scp:

                $ time /usr/local/bin/scp 10G root@$dest_ip:/dev/null
                1m31.616s
                scp with hpn ssh patch (scp -- hpn patch on client only, so not a good test possibly):
                1m32.707s

                socat:

                server:
                $ socat -u TCP-LISTEN:9876,reuseaddr OPEN:/dev/null,creat,trunc
                client:
                $ time socat -u FILE:10G TCP:$dest_ip:9876
                0m15.989s


                And two boxes on 10 Gbe, slightly older versions of netcat (CentOs 6.7), 10GB file:



                nc: 0m18.706s (100% cpu, v1.84, no -q option)
                iperf3: 0m10.013s (100% cpu, but can go up to at least 20Gbe with 100% cpu so not sure it matters)
                socat: 0m10.293s (88% cpu, possibly maxed out)


                So on one instance netcat used less cpu, on the other socat, so YMMV.



                With netcat, if it doesn't have a "-N" or "-q 0" option it can transfer truncated files, so be careful... Other options like "-w 10" can also result in truncated files.



                What is happening in almost all of these cases is the cpu is being maxed out, not the network. scp maxes out at about 230 MB/s, pegging one core at 100% utilization.



                Iperf3 unfortunately creates corrupted files. Some versions of netcat seem to not transfer the entire file, very weird. Especially older versions of it.



                Various incantations of "gzip as a pipe to netcat" or "mbuffer" also seemed to max out the cpu with the gzip or mbuffer, so didn't result in faster transfer with such large pipes. lz4 might help. In addition, some of the gzip pipe stuff I attempted resulted in corrupted transfers for very large (> 4 GB) files so be careful out there :)



                Another thing that might help, especially for higher latency, is tuning TCP settings. Here is a guide that mentions suggested values:



                http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm and https://fasterdata.es.net/host-tuning/linux/ (from another answer)
                possibly IRQ settings: https://fasterdata.es.net/host-tuning/100g-tuning/



                suggestions from linode, add to /etc/sysctl.conf:



                net.core.rmem_max = 268435456 
                net.core.wmem_max = 268435456
                net.ipv4.tcp_rmem = 4096 87380 134217728
                net.ipv4.tcp_wmem = 4096 65536 134217728
                net.core.netdev_max_backlog = 250000
                net.ipv4.tcp_no_metrics_save = 1
                net.core.default_qdisc = fq


                Additionally, they would like you to run:



                 /sbin/ifconfig eth0 txqueuelen 10000 


                It's worth double-checking after tweaking to make sure the changes don't cause harm.



                Also may be worth tuning the window size: https://iperf.fr/iperf-doc.php#tuningtcp



                With slow(er) connections compression can definitely help though. If you have big pipes, very fast compression might help with readily compressible data, haven't tried it.



                The standard answer for "syncing hard drives" is to rsync the files, that avoids transfer where possible.



































                  20 Answers
                  20






                  active

                  oldest

                  votes
















                  up vote
                  136
                  down vote













                  Since the servers are physically next to each other, and you mentioned in the comments you have physical access to them, the fastest way would be to take the hard-drive out of the first computer, place it into the second, and transfer the files over the SATA connection.
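                  Once the source drive is in the target machine, the raw copy itself is a single dd (device names below are hypothetical; verify with lsblk before running anything):

```shell
# /dev/sdb = the transplanted 320GB disk; the image lands on the big array.
# status=progress gives a running byte count; drop conv=noerror,sync if you
# would rather the copy abort on a read error than silently pad bad blocks.
dd if=/dev/sdb of=/backup/sda.img bs=16M status=progress conv=noerror,sync
```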






















                  • 15




                    +1: Transferring via physical seems to be the fastest route, even if it means getting a big external hard drive from somewhere. It's about £40, and you've probably spent that much in time already,
                    – deworde
                    Sep 7 '15 at 10:06






                  • 3




                    I completely disagree with this idea if one is getting full speed across a gigabit network. Testing over NFS/SMB over a Zyxel Gigabit switch between an HP Gen 7 microserver, and a Pentium G630 machine gives me ~100MB/s transfer. (Until I leave the outer edge of the drive platters.) So I think it'd realistically be done in under 3 hours. Unless you are using SSD's or extremely high performance drives/storage, I don't think 2 copies can produce a throughput of 100MB/s, that would require each copy operation to be 200MB/s just to break even.
                    – Phizes
                    Sep 8 '15 at 1:52






                  • 3




                    @Phizes: obviously you don't copy to a temporary. That was deword's bad idea, not what everyone else is talking about. The point of connecting the source drive to the target machine is to go SATA->SATA with dd (or a filesystem tree copy).
                    – Peter Cordes
                    Sep 9 '15 at 21:12






                  • 10




                    "Never underestimate the bandwidth of a truck full of hard drives. One hell of a latency though"
                    – Kevin
                    Sep 11 '15 at 4:31






                  • 3




                    @Kevin: yes, my point was that a direct copy between disks in the same computer is at least as fast as any other possible method. I brought up real-life bandwidth numbers to acknowledge Phize's point that going over gigE is fine for the OPs old drive, but a bottleneck for new drives. (One case where both drives in one computer is not the best option is when having separate computers using their RAM to cache the metadata of the source and dest is important, e.g. for rsync of billions of files.)
                    – Peter Cordes
                    Sep 13 '15 at 3:59














                  up vote
                  136
                  down vote













                  Since the servers are physically next to each other, and you mentioned in the comments you have physical access to them, the fastest way would be to take the hard-drive out of the first computer, place it into the second, and transfer the files over the SATA connection.






                  share|improve this answer
















                  • 15




                    +1: Transferring via physical seems to be the fastest route, even if it means getting a big external hard drive from somewhere. It's about £40, and you've probably spent that much in time already,
                    – deworde
                    Sep 7 '15 at 10:06






                  • 3




                    I completely disagree with this idea if one is getting full speed across a gigabit network. Testing over NFS/SMB over a Zyxel Gigabit switch between an HP Gen 7 microserver, and a Pentium G630 machine gives me ~100MB/s transfer. (Until I leave the outer edge of the drive platters.) So I think it'd realistically be done in under 3 hours. Unless you are using SSD's or extremely high performance drives/storage, I don't think 2 copies can produce a throughput of 100MB/s, that would require each copy operation to be 200MB/s just to break even.
                    – Phizes
                    Sep 8 '15 at 1:52






                  • 3




                    @Phizes: obviously you don't copy to a temporary. That was deword's bad idea, not what everyone else is talking about. The point of connecting the source drive to the target machine is to go SATA->SATA with dd (or a filesystem tree copy).
                    – Peter Cordes
                    Sep 9 '15 at 21:12






                  • 10




                    "Never underestimate the bandwidth of a truck full of hard drives. One hell of a latency though"
                    – Kevin
                    Sep 11 '15 at 4:31






                  • 3




                    @Kevin: yes, my point was that a direct copy between disks in the same computer is at least as fast as any other possible method. I brought up real-life bandwidth numbers to acknowledge Phize's point that going over gigE is fine for the OPs old drive, but a bottleneck for new drives. (One case where both drives in one computer is not the best option is when having separate computers using their RAM to cache the metadata of the source and dest is important, e.g. for rsync of billions of files.)
                    – Peter Cordes
                    Sep 13 '15 at 3:59












                  up vote
                  136
                  down vote










                  up vote
                  136
                  down vote









                  Since the servers are physically next to each other, and you mentioned in the comments you have physical access to them, the fastest way would be to take the hard-drive out of the first computer, place it into the second, and transfer the files over the SATA connection.






                  share|improve this answer












                  Since the servers are physically next to each other, and you mentioned in the comments you have physical access to them, the fastest way would be to take the hard-drive out of the first computer, place it into the second, and transfer the files over the SATA connection.







                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Sep 7 '15 at 5:51









                  BlueRaja - Danny Pflughoeft

                  1,207166




                  1,207166







                  • 15




                    +1: Transferring via physical seems to be the fastest route, even if it means getting a big external hard drive from somewhere. It's about £40, and you've probably spent that much in time already,
                    – deworde
                    Sep 7 '15 at 10:06






                  • 3




                    I completely disagree with this idea if one is getting full speed across a gigabit network. Testing over NFS/SMB over a Zyxel Gigabit switch between an HP Gen 7 microserver, and a Pentium G630 machine gives me ~100MB/s transfer. (Until I leave the outer edge of the drive platters.) So I think it'd realistically be done in under 3 hours. Unless you are using SSD's or extremely high performance drives/storage, I don't think 2 copies can produce a throughput of 100MB/s, that would require each copy operation to be 200MB/s just to break even.
                    – Phizes
                    Sep 8 '15 at 1:52






                  • 3




                    @Phizes: obviously you don't copy to a temporary. That was deword's bad idea, not what everyone else is talking about. The point of connecting the source drive to the target machine is to go SATA->SATA with dd (or a filesystem tree copy).
                    – Peter Cordes
                    Sep 9 '15 at 21:12






                  • 10




                    "Never underestimate the bandwidth of a truck full of hard drives. One hell of a latency though"
                    – Kevin
                    Sep 11 '15 at 4:31






                  • 3




                    @Kevin: yes, my point was that a direct copy between disks in the same computer is at least as fast as any other possible method. I brought up real-life bandwidth numbers to acknowledge Phize's point that going over gigE is fine for the OPs old drive, but a bottleneck for new drives. (One case where both drives in one computer is not the best option is when having separate computers using their RAM to cache the metadata of the source and dest is important, e.g. for rsync of billions of files.)
                    – Peter Cordes
                    Sep 13 '15 at 3:59
























                  up vote
                  69
                  down vote













                  netcat is great for situations like this where security is not an issue:



                  # on destination machine, create listener on port 9999
                  nc -l 9999 > /path/to/outfile

                  # on source machine, send to destination:9999
                  nc destination_host_or_ip 9999 < /dev/sda
                  # or dd if=/dev/sda | nc destination_host_or_ip 9999


                  Note: if you are using dd from GNU coreutils, you can send SIGUSR1 to the process and it will print a progress report to stderr. For BSD dd, use SIGINFO.
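
                  For example, with GNU dd the progress report can be requested like this (the pgrep pattern is an assumption; adjust it if more than one dd process is running):

                  # Ask the most recent GNU dd process for a progress report;
                  # it prints records in/out and bytes copied to its stderr.
                  kill -USR1 "$(pgrep -nx dd)"

                  On recent GNU coreutils you can instead pass status=progress to dd itself.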



                  pv is even more helpful in reporting progress during the copy:



                  # on destination
                  nc -l 9999 | pv > /path/to/outfile

                  # on source
                  pv /dev/sda | nc destination_host_or_ip 9999
                  # or dd if=/dev/sda | pv | nc destination_host_or_ip 9999
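
                  Because plain nc does no integrity checking of its own, one way to get the basic error detection asked for in the question is to hash the stream on both ends with tee and bash process substitution (a sketch; the host, port, and paths are placeholders):

                  # on source: hash the stream while sending it
                  pv /dev/sda | tee >(sha256sum > /tmp/src.sha256) | nc destination_host_or_ip 9999

                  # on destination: hash while writing, then compare the two digests
                  nc -l 9999 | tee >(sha256sum > /tmp/dst.sha256) > /path/to/outfile

                  If the digests in /tmp/src.sha256 and /tmp/dst.sha256 match, the bytes written are the bytes that were read.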























                  • 2




                    For the second example, is dd even required, or can pv/nc treat /dev/sda just fine on their own? (I have noticed some commands "throw up" when trying to read special files like that one, or files with 0x00 bytes)
                    – IQAndreas
                    Sep 7 '15 at 4:07







                  • 5




                    @user1794469 Will the compression help? I'm thinking the network is not where the bottleneck is.
                    – IQAndreas
                    Sep 7 '15 at 6:40






                  • 16




                    Don’t forget that in bash one can use > /dev/tcp/ IP / port and < /dev/tcp/ IP / port redirections instead of piping to and from netcat respectively.
                    – Incnis Mrsi
                    Sep 7 '15 at 16:41







                  • 5




                    Good answer. Gigabit Ethernet is often faster than hard drive speed, so compression is useless. To transfer several files consider tar cv sourcedir | pv | nc dest_host_or_ip 9999 and cd destdir ; nc -l 9999 | pv | tar xv. Many variations are possible, you might e.g. want to keep a .tar.gz at destination side rather than copies. If you copy directory to directory, for extra safety you can perform a rsync afterwards, e.g. from dest rsync --inplace -avP user@192.168.1.100:/path/to/source/. /path/to/destination/. it will guarantee that all files are indeed exact copies.
                    – Stéphane Gourichon
                    Sep 8 '15 at 9:08






                  • 3




                    Instead of using IPv4 you can achieve a better throughput by using IPv6 because it has a bigger payload. You don't even configure it, if the machines are IPv6 capable they probably already have an IPv6 link-local address
                    – David Costa
                    Sep 8 '15 at 13:07






















                  edited Sep 7 '15 at 4:29

























                  answered Sep 7 '15 at 4:00









                  zackse


















                  up vote
                  33
                  down vote














                  1. Do use fast compression.



                    • Whatever your transfer medium - especially for network or usb - you'll be working with data bursts for reads, caches, and writes, and these will not exactly be in sync.

                    • Besides the disk firmware, disk caches, and kernel/ram caches, if you can also employ the systems' CPUs in some way to concentrate the amount of data exchanged per burst then you should do so.

                    • Any compression algorithm at all will automatically handle sparse runs of input as fast as possible, but there are very few that will handle the rest at network throughputs.


                    • lz4 is your best option here:


                      LZ4 is a very fast lossless compression algorithm, providing compression speed at 400 MB/s per core, scalable with multi-cores CPU. It also features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.






                  2. Preferably do not unnecessarily seek.



                    • This can be difficult to gauge.


                    • If there is a lot of free space on the device from which you copy, and the device has not been recently zeroed, but all of the source file-system(s) should be copied, then it is probably worth your while to first do something like:



                      </dev/zero tee >empty empty1 empty2; sync; rm empty*



                    • But that depends on what level you should be reading the source. It is usually desirable to read the device from start to finish from its /dev/some_disk device file, because reading at the file-system level will generally involve seeking back-and-forth and around the disk non-sequentially. And so your read command should be something like:



                      </dev/source_device lz4 | ...



                    • However, if your source file-system should not be transferred entire, then reading at the file-system level is fairly unavoidable, and so you should ball up your input contents into a stream. pax is generally the best and most simple solution in that case, but you might also consider mksquashfs as well.



                      pax -w /source/tree[12] | lz4 | ...
                      mksquashfs /source/tree[12] /dev/fd/1 -comp lz4 | ...




                  3. Do not encrypt with ssh.



                    • Adding encryption overhead to a trusted medium is unnecessary, and can be severely detrimental to the speed of sustained transfers, because the data that is read must effectively be processed twice.

                    • The PRNG needs the read data, or at least some of it, to sustain randomness.

                    • And of course you need to transfer the data as well.

                    • You also need to transfer the encryption overhead itself - which means more work for less data transferred per burst.


                    • And so rather you should use netcat (or, as I prefer, the nmap project's more capable ncat) for a simple network copy, as has elsewhere been suggested:



                      ### on tgt machine...
                      nc -l 9999 > out.lz4
                      ### then on src machine...
                      ... lz4 | nc tgt.local 9999
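
                  Putting points 1 and 3 together, a complete compressed, unencrypted block transfer might look like the following sketch (it assumes lz4 is installed on both machines, that tgt.local resolves, and that the output path exists):

                      ### on the target machine: listen, decompress, and write out the image
                      nc -l 9999 | lz4 -d > /path/to/sda.img

                      ### on the source machine: read the device, compress, and send
                      lz4 < /dev/sda | nc tgt.local 9999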

























                  • 1




                    Fantastic answer. One minor grammatical point -- "lessen the amount of data which needs exchanging per burst" -- I think you're using compression to increase the information density as the 'bursts' are fixed-width and therefore the amount of exchanged data remains constant though the information transferred per burst may vary.
                    – Engineer Dollery
                    Sep 9 '15 at 8:39










                  • @EngineerDollery - yes, that was dumb. I think it is better,
                    – mikeserv
                    Sep 10 '15 at 8:17










                  • @IQAndreas - I would seriously consider this answer. Personally I use pigz, and the speed increase is amazing. The parallelism is a huge win; CPUs are much faster than any other part of the data pipeline so I doubt parallel compression will slow you down (gzip is not parallelizable). You may find this fast enough that there's no incentive to juggle hard drives; I wouldn't be surprised if this one is overall faster (including disk swap time). You could benchmark with and without compression. In any case, either BlueRaja's diskswap answer or this one should be your accepted answer.
                    – Mike S
                    Sep 10 '15 at 13:38










                  • Fast compression is an excellent advice. It should be noted, though, that it helps only if the data is reasonably compressible, which means, e.g., that it must not already be in a compressed format.
                    – Walter Tross
                    Sep 10 '15 at 21:50











                  • @WalterTross - it will help if any input is compressible, no matter the ratio, so long as the compression job outperforms the transfer job. On a modern four-core system an lz4 job should easily pace even wide-open GIGe, and USB 2.0 doesn't stand a chance. Besides, lz4 was designed only to work when it should - it is partly so fast because it knows when compression should be attempted and when it shouldn't. And if it is a device-file being transferred, then even precompressed input may compress somewhat anyway if there is any fragmentation in the source filesystem.
                    – mikeserv
                    Sep 10 '15 at 22:13














                  up vote
                  33
                  down vote














                  1. Do use fast compression.



                    • Whatever your transfer medium - especially for network or usb - you'll be working with data bursts for reads, caches, and writes, and these will not exactly be in sync.

                    • Besides the disk firmware, disk caches, and kernel/ram caches, if you can also employ the systems' CPUs in some way to concentrate the amount of data exchanged per burst then you should do so.

                    • Any compression algorithm at all will automatically handle sparse runs of input as fast as possible, but there are very few that will handle the rest at network throughputs.


                    • lz4 is your best option here:


                      LZ4 is a very fast lossless compression algorithm, providing compression speed at 400 MB/s per core, scalable with multi-cores CPU. It also features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.






                  2. Preferably do not unnecessarily seek.



                    • This can be difficult to gauge.


                    • If there is a lot of free space on the device from which you copy, and the device has not been recently zeroed, but all of the source file-system(s) should be copied, then it is probably worth your while to first do something like:



                      </dev/zero tee >empty empty1 empty2; sync; rm empty*



                    • But that depends on what level you should be reading the source. It is usually desirable to read the device from start to finish from its /dev/some_disk device file, because reading at the file-system level will generally involve seeking back-and-forth and around the disk non-sequentially. And so your read command should be something like:



                      </dev/source_device lz4 | ...



                    • However, if your source file-system should not be transferred entire, then reading at the file-system level is fairly unavoidable, and so you should ball up your input contents into a stream. pax is generally the best and most simple solution in that case, but you might also consider mksquashfs as well.



                      pax -w /source/tree[12] | lz4 | ...
                      mksquashfs /source/tree[12] /dev/fd/1 -comp lz4 | ...




                  3. Do not encrypt with ssh.



                    • Adding encryption overhead to a trusted medium is unnecessary, and can be severely detrimental to the speed of sustained transfers, because the data effectively needs to be read twice.

                    • The PRNG needs the read data, or at least some of it, to sustain randomness.

                    • And of course you need to transfer the data as well.

                    • You also need to transfer the encryption overhead itself - which means more work for less data transferred per burst.


                    • So instead you should use netcat (or, as I prefer, the nmap project's more capable ncat) for a simple network copy, as has been suggested elsewhere:



                      ### on tgt machine...
                      nc -l 9999 > out.lz4
                      ### then on src machine...
                      ... lz4 | nc tgt.local 9999
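
                  Putting the three steps above together, here is a minimal end-to-end sketch. This is not from the original answer: the hostname tgt.local, port 9999, and /dev/sda are placeholders, it assumes lz4, nc, and sha256sum are installed, and it assumes bash on the source machine for the >( ) process substitution. The checksum pass supplies the basic error detection the question asks for:

```shell
### on the target machine: listen, store the compressed stream, and checksum it in one pass
nc -l 9999 | tee backup_sda.lz4 | sha256sum > backup_sda.lz4.sha256

### on the source machine: stream the device through lz4, checksumming exactly what is sent
lz4 </dev/sda | tee >(sha256sum > sent.sha256) | nc tgt.local 9999

### afterwards, compare sent.sha256 (source) against backup_sda.lz4.sha256 (target);
### matching hashes mean the compressed stream arrived intact
```

                  The sha256sum passes run concurrently with the transfer itself, reading the data as it moves through the pipe, so they add little to the total wall-clock time.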

























                  • 1




                    Fantastic answer. One minor grammatical point -- "lessen the amount of data which needs exchanging per burst" -- I think you're using compression to increase the information density as the 'bursts' are fixed-width and therefore the amount of exchanged data remains constant though the information transferred per burst may vary.
                    – Engineer Dollery
                    Sep 9 '15 at 8:39










                  • @EngineerDollery - yes, that was dumb. I think it is better,
                    – mikeserv
                    Sep 10 '15 at 8:17










                  • @IQAndreas - I would seriously consider this answer. Personally I use pigz, and the speed increase is amazing. The parallelism is a huge win; CPUs are much faster than any other part of the data pipeline so I doubt parallel compression will slow you down (gzip is not parallelizable). You may find this fast enough that there's no incentive to juggle hard drives; I wouldn't be surprised if this one is overall faster (including disk swap time). You could benchmark with and without compression. In any case, either BlueRaja's diskswap answer or this one should be your accepted answer.
                    – Mike S
                    Sep 10 '15 at 13:38










                  • Fast compression is excellent advice. It should be noted, though, that it helps only if the data is reasonably compressible, which means, e.g., that it must not already be in a compressed format.
                    – Walter Tross
                    Sep 10 '15 at 21:50











                  • @WalterTross - it will help if any input is compressible, no matter the ratio, so long as the compression job outperforms the transfer job. On a modern four-core system an lz4 job should easily pace even wide-open GIGe, and USB 2.0 doesn't stand a chance. Besides, lz4 was designed only to work when it should - it is partly so fast because it knows when compression should be attempted and when it shouldn't. And if it is a device-file being transferred, then even precompressed input may compress somewhat anyway if there is any fragmentation in the source filesystem.
                    – mikeserv
                    Sep 10 '15 at 22:13












                  edited Oct 18 '16 at 6:08

























                  answered Sep 8 '15 at 9:08









                  mikeserv

                  44.6k565150




                  up vote
                  25
                  down vote













                  There are several factors that could be limiting the transfer speed.



                  1. There is inherent network overhead on a 1Gbps pipe. Usually, this reduces ACTUAL throughput to 900Mbps or less. Then you have to remember that this is bidirectional traffic and you should expect significantly less than 900Mbps down.


                  2. Even though you're using a "new-ish router," are you certain that the router supports 1Gbps? Not all new routers do. Also, unless it is an enterprise-grade router, you likely are going to lose additional transmit bandwidth to router inefficiency. Though based on what I found below, it looks like you're getting above 100Mbps.


                  3. There could be network congestion from other devices sharing your network. Have you tried using a directly attached cable as you said you were able to do?


                  4. How much of your disk I/O capacity are you using? Likely you're being limited not by the network but by the disk drive. Most 7200rpm HDDs will only get around 40MB/s. Are you using RAID at all? Are you using SSDs? What are you using on the remote end?


                  I suggest using rsync if this is expected to be re-run for backups. You could also use scp, ftp(s), or http with a downloader like FileZilla on the other end, since it can parallelize ssh/http/https/ftp connections. This can increase the bandwidth, because the other solutions run over a single pipe; a single pipe/thread is limited by being single-threaded, which means it could even be CPU-bound.



                  With rsync, you remove a large amount of the complexity from your solution, and you also get compression, permission preservation, and partial transfers. There are several other reasons, but it is generally the preferred backup method of large enterprises (or what runs their backup systems). Commvault actually uses rsync underneath their software as the delivery mechanism for backups.
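
                  As a sketch of the rsync suggestion (the paths and host address here are placeholders, and -z may waste CPU on data that is already compressed):

```shell
# -a (archive) preserves permissions, ownership, and timestamps;
# -z compresses in transit; --partial lets an interrupted run resume.
rsync -az --partial --progress /source/data/ user@192.168.1.100:/backup/data/
```

                  On re-runs, rsync transfers only files that changed, which is why it pays off for repeated backups.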



                  Based on your given example of 80GB/h, you're getting around 177Mbps (22.2MB/s). I feel you could easily double this with rsync on a dedicated ethernet line between the two boxes as I've managed to get this in my own tests with rsync over gigabit.
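
                  The arithmetic behind that estimate, as a quick sanity check (decimal units, 1 GB = 1000 MB, 1 byte = 8 bits):

```shell
# 80 GB/h -> MB/s -> Mbps (integer shell arithmetic, so results round down)
mb_per_s=$(( 80 * 1000 / 3600 ))   # 22 MB/s (more precisely 22.2)
mbit_per_s=$(( mb_per_s * 8 ))     # 176 Mbps; 22.2 MB/s * 8 gives the ~177 Mbps figure
echo "$mb_per_s MB/s, about $mbit_per_s Mbps"
```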
























                  • 11




                    +1 for rsync. It might not be faster the first time you run it, but it certainly will be for all the subsequent times.
                    – Skrrp
                    Sep 7 '15 at 8:23






                  • 4




                    > Most 7200rpm HDDs will only get around 40MB/s. IME you're more likely to see over 100MB/s sequential with a modern drive (and this includes ~5k drives). Though, this might be an older disk.
                    – Bob
                    Sep 7 '15 at 15:38






                  • 2




                    @Bob: Those modern drives can still read only 5400 circular tracks per minute. These disks are still fast because each track contains more than a megabyte. That does mean they're also quite big disks; a small 320 GB disk can't hold too many kilobytes per track, which necessarily limits its speed.
                    – MSalters
                    Sep 7 '15 at 17:07






                  • 1




                    40MB/s is definitely very pessimistic for sequential read for any drive made in the last decade. Current 7200RPM drives can exceed 100MB/s as Bob says.
                    – hobbs
                    Sep 10 '15 at 18:00






                  • 3




                    Gigabit Ethernet is 1000 Mbps full duplex. You get 1000 Mbps (or, as you say, around 900 Mbps in reality) each direction. Second... hard drives now routinely get 100MB/sec. 40MB/sec is slow, unless this is a decade-old drive.
                    – derobert
                    Sep 10 '15 at 20:54















                  up vote
                  25
                  down vote













                  There are several limitations that could be limiting the transfer speed.



                  1. There is inherent network overhead on a 1Gbps pipe. Usually, this reduces ACTUAL throughput to 900Mbps or less. Then you have to remember that this is bidirectional traffic and you should expect significantly less than 900Mbps down.


                  2. Even though you're using a "new-ish router" are you certain that the router supports 1Gbps? Not all new routers support 1Gbps. Also, unless it is an enterprise-grade router, you likely are going to lose additional transmit bandwidth to the router being inefficient. Though based on what I found below, it looks like you're getting above 100Mbps.


                  3. There could be network congestion from other devices sharing your network. Have you tried using a directly attached cable as you said you were able to do?


                  4. What amount of your disk IO are you using? Likely, you're being limited, not by the network, but by the disk drive. Most 7200rpm HDDs will only get around 40MB/s. Are you using raid at all? Are you using SSDs? What are you using on the remote end?


                  I suggest using rsync if this is expected to be re-run for backups. You could also scp, ftp(s), or http using a downloader like filezilla on the other end as it will parallelize ssh/http/https/ftp connections. This can increase the bandwidth as the other solutions are over a single pipe. A single pipe/thread is still limited by the fact that it is single-threaded, which means that it could even be CPU bound.



                  With rsync, you take out a large amount of the complexity of your solution as well as allows compression, permission preservation, and allow partial transfers. There are several other reasons, but it is generally the preferred backup method (or runs the backup systems) of large enterprises. Commvault actually uses rsync underneath their software as the delivery mechanism for backups.



                  Based on your given example of 80GB/h, you're getting around 177Mbps (22.2MB/s). I feel you could easily double this with rsync on a dedicated ethernet line between the two boxes as I've managed to get this in my own tests with rsync over gigabit.






                  share|improve this answer


















                  • 11




                    +1 for rsync. It might not be faster the first time you run it, but it certainly will be for all the subsequent times.
                    – Skrrp
                    Sep 7 '15 at 8:23






                  • 4




                    > Most 7200rpm HDDs will only get around 40MB/s. IME you're more likely to see over 100MB/s sequential with a modern drive (and this includes ~5k drives). Though, this might be an older disk.
                    – Bob
                    Sep 7 '15 at 15:38






                  • 2




                    @Bob: Those modern still can read only 5400 circular tracks per minute. These disks are still fast because each track contains more than a megabyte. That does mean they're also quite big disks, A small 320 GB disk can't hold too many kilobytes per track, which necessarily limits their speed.
                    – MSalters
                    Sep 7 '15 at 17:07






                  • 1




                    40MB/s is definitely very pessimistic for sequential read for any drive made in the last decade. Current 7200RPM drives can exceed 100MB/s as Bob says.
                    – hobbs
                    Sep 10 '15 at 18:00






                  • 3




                    Gigabit Ethernet is 1000 mbps full duplex. You get 1000mbps (or, as you say, around 900mbps in reality) each direction. Second... hard drives now routinely get 100MB/sec. 40MB/sec is slow, unless this is a decade old drive.
                    – derobert
                    Sep 10 '15 at 20:54













                  up vote
                  25
                  down vote










                  up vote
                  25
                  down vote









                  There are several limitations that could be limiting the transfer speed.



                  1. There is inherent network overhead on a 1Gbps pipe. Usually, this reduces ACTUAL throughput to 900Mbps or less. Then you have to remember that this is bidirectional traffic and you should expect significantly less than 900Mbps down.


                  2. Even though you're using a "new-ish router" are you certain that the router supports 1Gbps? Not all new routers support 1Gbps. Also, unless it is an enterprise-grade router, you likely are going to lose additional transmit bandwidth to the router being inefficient. Though based on what I found below, it looks like you're getting above 100Mbps.


                  3. There could be network congestion from other devices sharing your network. Have you tried using a directly attached cable as you said you were able to do?


                  4. What amount of your disk IO are you using? Likely, you're being limited, not by the network, but by the disk drive. Most 7200rpm HDDs will only get around 40MB/s. Are you using raid at all? Are you using SSDs? What are you using on the remote end?


                  I suggest using rsync if this is expected to be re-run for backups. You could also scp, ftp(s), or http using a downloader like filezilla on the other end as it will parallelize ssh/http/https/ftp connections. This can increase the bandwidth as the other solutions are over a single pipe. A single pipe/thread is still limited by the fact that it is single-threaded, which means that it could even be CPU bound.



                  With rsync, you take out a large amount of the complexity of your solution as well as allows compression, permission preservation, and allow partial transfers. There are several other reasons, but it is generally the preferred backup method (or runs the backup systems) of large enterprises. Commvault actually uses rsync underneath their software as the delivery mechanism for backups.



                  Based on your given example of 80GB/h, you're getting around 177Mbps (22.2MB/s). I feel you could easily double this with rsync on a dedicated ethernet line between the two boxes as I've managed to get this in my own tests with rsync over gigabit.






                  share|improve this answer














                  There are several limitations that could be limiting the transfer speed.



                  1. There is inherent network overhead on a 1Gbps pipe. Usually, this reduces ACTUAL throughput to 900Mbps or less. Then you have to remember that this is bidirectional traffic and you should expect significantly less than 900Mbps down.


                  2. Even though you're using a "new-ish router" are you certain that the router supports 1Gbps? Not all new routers support 1Gbps. Also, unless it is an enterprise-grade router, you likely are going to lose additional transmit bandwidth to the router being inefficient. Though based on what I found below, it looks like you're getting above 100Mbps.


                  3. There could be network congestion from other devices sharing your network. Have you tried using a directly attached cable as you said you were able to do?


                  4. What amount of your disk IO are you using? Likely, you're being limited, not by the network, but by the disk drive. Most 7200rpm HDDs will only get around 40MB/s. Are you using raid at all? Are you using SSDs? What are you using on the remote end?


                  I suggest using rsync if this is expected to be re-run for backups. You could also scp, ftp(s), or http using a downloader like filezilla on the other end as it will parallelize ssh/http/https/ftp connections. This can increase the bandwidth as the other solutions are over a single pipe. A single pipe/thread is still limited by the fact that it is single-threaded, which means that it could even be CPU bound.



                  With rsync, you take out a large amount of the complexity of your solution as well as allows compression, permission preservation, and allow partial transfers. There are several other reasons, but it is generally the preferred backup method (or runs the backup systems) of large enterprises. Commvault actually uses rsync underneath their software as the delivery mechanism for backups.



Based on your given figure of 80GB/h, you're getting around 177Mbps (22.2MB/s). I feel you could easily double that with rsync on a dedicated ethernet line between the two boxes, as I've managed to do in my own tests with rsync over gigabit.















edited Sep 7 '15 at 6:31 by muru










answered Sep 7 '15 at 5:53 by Khrystoph







• +1 for rsync. It might not be faster the first time you run it, but it certainly will be for all the subsequent times. – Skrrp, Sep 7 '15 at 8:23

• > Most 7200rpm HDDs will only get around 40MB/s. IME you're more likely to see over 100MB/s sequential with a modern drive (and this includes ~5k drives). Though, this might be an older disk. – Bob, Sep 7 '15 at 15:38

• @Bob: Those modern drives can still read only 5400 circular tracks per minute. They are still fast because each track contains more than a megabyte. That does mean they're also quite big disks; a small 320GB disk can't hold too many kilobytes per track, which necessarily limits its speed. – MSalters, Sep 7 '15 at 17:07

• 40MB/s is definitely very pessimistic for sequential read for any drive made in the last decade. Current 7200RPM drives can exceed 100MB/s, as Bob says. – hobbs, Sep 10 '15 at 18:00

• Gigabit Ethernet is 1000Mbps full duplex. You get 1000Mbps (or, as you say, around 900Mbps in reality) each direction. Second... hard drives now routinely get 100MB/sec. 40MB/sec is slow, unless this is a decade-old drive. – derobert, Sep 10 '15 at 20:54













                  up vote
                  16
                  down vote













                  We deal with this regularly.



                  The two main methods we tend to use are:



                  1. SATA/eSATA/sneakernet

                  2. Direct NFS mount, then local cp or rsync

                  The first is dependent on whether the drive can be physically relocated. This is not always the case.



The second works surprisingly well. Generally we max out a 1Gbps connection rather easily with direct NFS mounts. You won't get anywhere close to this with scp, dd over ssh, or anything similar (you'll often see a max rate suspiciously close to 100Mbps). Even on very fast multicore processors you will hit a bottleneck on the max crypto throughput of a single core on the slower of the two machines, which is depressingly slow compared to full-bore cp or rsync on an unencrypted network mount. Occasionally you'll hit an IOPS wall for a little while and be stuck at around ~53MB/s instead of the more typical ~110MB/s, but that is usually short-lived — unless the source or destination is actually a single drive, in which case you might wind up limited by the sustained rate of the drive itself (which varies enough, for random reasons, that you won't know until you actually try it).



NFS can be a little annoying to set up if it's on an unfamiliar distro, but generally speaking it has been the fastest way to fill the pipes as fully as possible. The last time I did this over 10Gbps I never actually found out whether it maxed out the connection, because the transfer was over before I came back from grabbing some coffee — so there may be some natural limit you hit there. If you have a few network devices between the source and destination you can encounter slight delays or hiccups from the network slinky effect, but generally this will work across the office (sans other traffic mucking it up) or from one end of the datacenter to the other (unless you have some sort of filtering/inspection occurring internally, in which case all bets are off).
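A minimal sketch of the NFS approach (the export path, subnet, server address, and mount point are all assumptions; run the first part on the source server and the second on the backup server, as root):

```shell
# On the source server: export the data directory read-only to the LAN.
# Add to /etc/exports (hypothetical path and subnet):
#   /data  192.168.1.0/24(ro,no_subtree_check)
exportfs -ra                        # reload the export table

# On the backup server: mount it and copy at wire speed, with no crypto layer.
mount -t nfs 192.168.1.50:/data /mnt/src
cp -a /mnt/src/. /backup/           # or: rsync -a /mnt/src/ /backup/
umount /mnt/src
```

Because the copy runs over a plain network mount, no single core is burned on encryption or compression, which is the whole point of this method.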



                  EDIT



I noticed some chatter about compression... do not compress the connection. It will slow you down the same way a crypto layer will. The bottleneck will always be a single core if you compress the connection (and you won't even get particularly good utilization of that core's bus). The slowest thing you can do in your situation is to use an encrypted, compressed channel between two computers sitting next to each other on a 1Gbps or higher connection.



                  FUTURE PROOFING



                  This advice stands as of mid-2015. This will almost certainly not be the case for too many more years. So take everything with a grain of salt, and if you face this task regularly then try a variety of methods on actual loads instead of imagining you will get anything close to theoretical optimums, or even observed compression/crypto throughput rates typical for things like web traffic, much of which is textual (protip: bulk transfers usually consist chiefly of images, audio, video, database files, binary code, office file formats, etc. which are already compressed in their own way and benefit very little from being run through yet another compression routine, the compression block size of which is almost guaranteed to not align with your already compressed binary data...).
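The point above about already-compressed payloads is easy to verify: run gzip over high-entropy data (a stand-in for media files, binaries, and other pre-compressed formats) and the output is no smaller — typically slightly larger, due to container overhead:

```shell
# 1MB of random bytes stands in for already-compressed payload data.
head -c 1000000 /dev/urandom > /tmp/random.bin
gzip -c /tmp/random.bin > /tmp/random.bin.gz
# Compare sizes: the "compressed" copy gains nothing on high-entropy input,
# yet still costs a core's worth of CPU time to produce.
wc -c /tmp/random.bin /tmp/random.bin.gz
```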



                  I imagine that in the future concepts like SCTP will be taken to a more interesting place, where bonded connections (or internally bonded-by-spectrum channelized fiber connections) are typical, and each channel can receive a stream independent of the others, and each stream can be compressed/encrypted in parallel, etc. etc. That would be wonderful! But that's not the case today in 2015, and though fantasizing and theorizing is nice, most of us don't have custom storage clusters running in a cryo-chamber feeding data directly to the innards of a Blue Gene/Q generating answers for Watson. That's just not reality. Nor do we have time to analyze our data payload exhaustively to figure out whether compression is a good idea or not -- the transfer itself would be over before we finished our analysis, regardless how bad the chosen method turned out to be.



                  But...



Times change, and my recommendation against compression and encryption will not stand forever. I really would love for this advice to be overturned in the typical case very soon. It would make my life easier.
























• @jofel Only when the network speed is slower than the processor's compression throughput — which is never true for 1Gbps or higher connections. In the typical case, though, the network is the bottleneck, and compression does effectively speed things up — but this is not the case the OP describes. – zxq9, Sep 8 '15 at 0:42

• lz4 is fast enough to not bottleneck gigE, but depending on what you want to do with the copy, you might need it uncompressed. lzop is pretty fast, too. On my i5-2500k Sandybridge (3.8GHz), lz4 < /dev/raid0 | pv -a > /dev/null goes at ~180MB/s input, ~105MB/s output, just right for gigE. Decompressing on the receive side is even easier on the CPU. – Peter Cordes, Sep 9 '15 at 21:34

• @PeterCordes lz4 can indeed fit 1Gbps connections roughly the same way other compression routines fit 100Mbps connections — given the right circumstances, anyway. The basic problem is that compression itself is a bottleneck relative to a direct connection between systems, and usually a bottleneck between two systems on the same local network (typically 1Gbps or 10Gbps). The underlying lesson is that no setting is perfect for all environments and rules of thumb are not absolute. Fortunately, running a test transfer to figure out what works best in any immediate scenario is trivial. – zxq9, Sep 10 '15 at 4:39

• Also, 3.8GHz is a fair bit faster than most server processors run (or many business-grade systems of any flavor, at least that I'm used to seeing). It is more common to see much higher core counts with much lower clock speeds in data centers. Parallelization of transfer loads has not been an issue for a long time, so we are stuck with the max speed of a single core in most cases — but I expect this will change now that clock speeds are generally maxed out while network speeds still have a long way to go before hitting their maximums. – zxq9, Sep 10 '15 at 4:44

• I completely disagree with your comments regarding compression. It depends completely on the compressibility of the data. If you could get a 99.9% compression ratio, it would be foolish not to do so — why transfer 100GB when you can get away with transferring 100MB? I'm not suggesting that this level of compression is the case for this question, just showing that this has to be considered on a case-by-case basis and that there are no absolute rules. – Engineer Dollery, Sep 10 '15 at 10:31















edited Sep 10 '15 at 14:09
answered Sep 7 '15 at 12:50 by zxq9







                  • 1




                    @jofel Only when the network speed is slower than the processor's compression throughput -- which is never true for 1gpbs or higher connections. In the typical case, though, the network is the bottleneck, and compression does effectively speed things up -- but this is not the case the OP describes.
                    – zxq9
                    Sep 8 '15 at 0:42







                  • 2




                    lz4 is fast enough to not bottleneck gigE, but depending on what you want to do with the copy, you might need it uncompressed. lzop is pretty fast, too. On my i5-2500k Sandybridge (3.8GHz), lz4 < /dev/raid0 | pv -a > /dev/null goes at ~180MB/s input, ~105MB/s output, just right for gigE. Decompressing on the receive side is even easier on the CPU.
                    – Peter Cordes
                    Sep 9 '15 at 21:34











                  • @PeterCordes lz4 can indeed fit 1 Gbps connections roughly the same way other compression routines fit 100 Mbps connections -- given the right circumstances, anyway. The basic problem is that compression itself is a bottleneck relative to a direct connection between systems, and usually a bottleneck between two systems on the same local network (typically 1 Gbps or 10 Gbps). The underlying lesson is that no setting is perfect for all environments and rules of thumb are not absolute. Fortunately, running a test transfer to figure out what works best in any immediate scenario is trivial.
                    – zxq9
                    Sep 10 '15 at 4:39






                  • 1




                    Also, 3.8GHz is a fair bit faster than most server processors run (or many business-grade systems of any flavor, at least that I'm used to seeing). It is more common to see much higher core counts with much lower clock speeds in data centers. Parallelization of transfer loads has not been an issue for a long time, so we are stuck with the max speed of a single core in most cases -- but I expect this will change now that clock speeds are generally maxed out but network speeds still have a long way to go before hitting their maximums.
                    – zxq9
                    Sep 10 '15 at 4:44






                  • 2




                    I completely disagree with your comments regarding compression. It depends completely on the compressibility of the data. If you could get a 99.9% compression ratio, it would be foolish not to do so -- why transfer 100GB when you can get away with transferring 100MB? I'm not suggesting that this level of compression is the case for this question, just showing that this has to be considered on a case by case basis and that there are no absolute rules.
                    – Engineer Dollery
                    Sep 10 '15 at 10:31













                  up vote
                  6
                  down vote













                  A nifty tool that I've used in the past is bbcp. As seen here: https://www.slac.stanford.edu/~abh/bbcp/ .



                  See also http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm



                  I've had very fast transfer speeds with this tool.
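                  For the record, the sort of invocation those pages suggest looks roughly like this (a sketch based on the linked docs; flag support varies by build, so check bbcp -h, and bbcp must be installed on both machines):

```shell
# Copy a file to the target with 16 parallel TCP streams, an 8 MB
# window, and progress reports every 2 seconds (flags as shown in
# the linked tutorial; verify against your bbcp version).
bbcp -P 2 -w 8m -s 16 disk_image user@192.168.1.100:/backups/disk_image
```

The parallel streams are the point: a single TCP stream often cannot fill a fast link, while several streams together can.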






















                  • 1




                    The second link of this answer explains how to tune kernel parameters to reach higher speeds. The author there got 800 megabytes per second on a 10G link, and some of it seems applicable to 1 Gbps links.
                    – Stéphane Gourichon
                    Sep 8 '15 at 9:20














                  answered Sep 7 '15 at 5:16
                  DarkHeart



                  up vote
                  5
                  down vote













                  If you get a first pass somehow (over the wire/sneakernet/whatever), you can look into rsync with certain options that can greatly speed up subsequent transfers. A very good way to go would be:



                  rsync -varzP sourceFiles destination


                  The options are: verbose, archive mode, recursive, compress, and partial/progress (-P).
























                  • 2




                    Rsync is more reliable than netcat, but archive implies recursive, so the r is redundant.
                    – Tanath
                    Sep 9 '15 at 20:04










                  • Also, -z can be incredibly slow depending on your CPU and what data you are processing. I've experienced transfers going from 30 MB/s to 125 MB/s when disabling compression.
                    – lindhe
                    Jan 14 '16 at 20:22














                  edited Sep 11 '15 at 17:51 (A.L)
                  answered Sep 9 '15 at 6:20
                  Hopping Bunny
                  up vote
                  4
                  down vote













                  Added at the insistence of the original poster in the comments to zackse’s answer, although I’m not sure it is the fastest method in typical circumstances.



                  bash has a special redirection syntax:

                  For output:     > /dev/tcp/IP/port

                  For input:      < /dev/tcp/IP/port

                  IP can be either a dotted-decimal IP address or a hostname;
                  port can be either a decimal number or a port name from /etc/services.



                  There is no actual /dev/tcp/ directory. It’s a special syntactic kludge that tells bash to create a TCP socket, connect it to the specified destination, and then do the same thing as a normal file redirection does (namely, replace the respective standard stream with the socket using dup2(2)).



                  Hence, one can stream data from dd or tar on the source machine directly over TCP, or, conversely, stream data into tar or something similar directly over TCP. Either way, one superfluous netcat process is eliminated.



                  Notes about netcat



                  There is an inconsistency in syntax between classical netcat and GNU netcat. I’ll use the classical syntax I’m accustomed to. Replace -lp with -l for GNU netcat.



                  Also, I’m unsure whether GNU netcat accepts the -q switch.



                  Transferring a disk image



                  (Along the lines of zackse’s answer.)

                  On destination:



                  nc -lp 9999 >disk_image


                  On source:



                  dd if=/dev/sda >/dev/tcp/destination/9999
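                  Since the question asks for basic error checking, it is worth hashing both ends after a raw transfer like this one (a sketch; the digests must match byte-for-byte, and the check is only meaningful if the source device was not written to during or after the copy):

                  ```shell
                  # On the source, after the transfer completes:
                  sha256sum /dev/sda

                  # On the destination:
                  sha256sum disk_image

                  # Compare the two digests; any dropped or corrupted bytes
                  # during transfer will change the hash.
                  ```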
                   


                  Creating a tar.gz archive, with tar



                  On destination:



                  nc -lp 9999 >backup.tgz


                  On source:



                  tar cz files or directories to be transferred >/dev/tcp/destination/9999


                  Replace .tgz with .tbz and cz with cj to get a bzip2-compressed archive.



                  Transferring with immediate expansion to file system



                  Also with tar.

                  On destination:



                  cd backups
                  tar x </dev/tcp/source/9999


                  On source:



                  tar c files or directories to be transferred |nc -q 1 -lp 9999


                  It will work without -q 1, but netcat will hang when the data ends. See tar(1) for an explanation of the syntax and caveats of tar. If there are many files with high redundancy (low entropy), then compression (e. g. cz and xz instead of c and x) can be tried, but if the files are typical and the network is fast enough, it would only slow the process down. See mikeserv’s answer for details about compression.



                  Alternative style (the destination listens on a port)



                  On destination:



                  cd backups
                  nc -lp 9999 |tar x


                  On source:



                  tar c files or directories to be transferred >/dev/tcp/destination/9999



























                  • bash can't actually "listen" on a socket apparently, in order to wait and receive a file: unix.stackexchange.com/questions/49936/… so you'd have to use something else for at least one half of the connection...
                    – rogerdpack
                    Sep 27 at 22:07














                  up vote
                  4
                  down vote













                  Added on insistence of the original poster in comments to zackse’s answer, although I’m not sure it is the fastest in typical circumstances.



                  bash has a special redirection syntax:

                  For output:     > /dev/tcp/ IP / port

                  For input:      < /dev/tcp/ IP / port
                  IP ban be either dotted-decimal IP or a hostname;
                  port ban be either a decimal number or a port name from /etc/services.



                  There is no actual /dev/tcp/ directory. It’s a special syntactic kludge that commands bash to create a TCP socket, connect it to the destination specified, and then do the same thing as a usual file redirection does (namely, replace the respective standard stream with the socket using dup2(2)).



                  Hence, one can stream data from dd or tar at the source machine directly via TCP. Or, conversely, to stream data to tar or something alike directly via TCP. In any case, one superfluous netcat is eliminated.



                  Notes about netcat



                  There is an inconsistency in syntax between classical netcat and GNU netcat. I’ll use the classical syntax I’m accustomed to. Replace -lp with -l for GNU netcat.



                  Also, I’m unsure whether does GNU netcat accept -q switch.



                  Transferring a disk image



                  (Along the lines of zackse’s answer.)

                  On destination:



                  nc -lp 9999 >disk_image


                  On source:



                  dd if=/dev/sda >/dev/tcp/destination/9999
                   


                  Creating a tar.gz archive, with tar



                  On destination:



                  nc -lp 9999 >backup.tgz


                  On source:



                  tar cz files or directories to be transferred >/dev/tcp/destination/9999


                  Replace .tgz with .tbz and cz with cj to get a bzip2-compressed archive.



                  Transferring with immediate expansion to file system



                  Also with tar.

                  On destination:



                  cd backups
                  tar x </dev/tcp/destination/9999


                  On source:



                  tar c files or directories to be transferred |nc -q 1 -lp 9999


                  It will work without -q 1, but netcat will stuck when data ended. See tar(1) for explanation of the syntax and caveats of tar. If there are many files with high redundancy (low entropy), then compression (e. g. cz and xz instead of c and x) can be tried, but if files are typical and the network is fast enough, it would only slow the process. See mikeserv’s answer for details about compression.



                  Alternative style (the destination listens port)



                  On destination:



                  cd backups
                  nc -lp 9999 |tar x


                  On source:



                  tar c files or directories to be transferred >/dev/tcp/destination/9999





                  share|improve this answer






















                  • bash can't actually "listen" on a socket apparently, in order to wait and receive a file: unix.stackexchange.com/questions/49936/… so you'd have to use something else for at least one half of the connection...
                    – rogerdpack
                    Sep 27 at 22:07












                  up vote
                  4
                  down vote










                  up vote
                  4
                  down vote









                  Added on insistence of the original poster in comments to zackse’s answer, although I’m not sure it is the fastest in typical circumstances.



                  bash has a special redirection syntax:

                  For output:     > /dev/tcp/ IP / port

                  For input:      < /dev/tcp/ IP / port
                  IP ban be either dotted-decimal IP or a hostname;
                  port ban be either a decimal number or a port name from /etc/services.



                  There is no actual /dev/tcp/ directory. It’s a special syntactic kludge that commands bash to create a TCP socket, connect it to the destination specified, and then do the same thing as a usual file redirection does (namely, replace the respective standard stream with the socket using dup2(2)).



                  Hence, one can stream data from dd or tar at the source machine directly via TCP. Or, conversely, to stream data to tar or something alike directly via TCP. In any case, one superfluous netcat is eliminated.



                  Notes about netcat



                  There is an inconsistency in syntax between classical netcat and GNU netcat. I’ll use the classical syntax I’m accustomed to. Replace -lp with -l for GNU netcat.



                  Also, I’m unsure whether does GNU netcat accept -q switch.



                  Transferring a disk image



                  (Along the lines of zackse’s answer.)

                  On destination:



                  nc -lp 9999 >disk_image


                  On source:



                  dd if=/dev/sda >/dev/tcp/destination/9999
                   


                  Creating a tar.gz archive, with tar



                  On destination:



                  nc -lp 9999 >backup.tgz


                  On source:



                  tar cz files or directories to be transferred >/dev/tcp/destination/9999


                  Replace .tgz with .tbz and cz with cj to get a bzip2-compressed archive.



                  Transferring with immediate expansion to file system



                  Also with tar.

                  On destination:



                  cd backups
                  tar x </dev/tcp/destination/9999


                  On source:



                  tar c files or directories to be transferred |nc -q 1 -lp 9999


                  It will work without -q 1, but netcat will stuck when data ended. See tar(1) for explanation of the syntax and caveats of tar. If there are many files with high redundancy (low entropy), then compression (e. g. cz and xz instead of c and x) can be tried, but if files are typical and the network is fast enough, it would only slow the process. See mikeserv’s answer for details about compression.



                  Alternative style (the destination listens port)



                  On destination:



                  cd backups
                  nc -lp 9999 |tar x


                  On source:



                  tar c files or directories to be transferred >/dev/tcp/destination/9999





                  share|improve this answer














                  Added on insistence of the original poster in comments to zackse’s answer, although I’m not sure it is the fastest in typical circumstances.



                  bash has a special redirection syntax:

                  For output:     > /dev/tcp/ IP / port

                  For input:      < /dev/tcp/ IP / port
                  IP ban be either dotted-decimal IP or a hostname;
                  port ban be either a decimal number or a port name from /etc/services.



                  There is no actual /dev/tcp/ directory. It’s a special syntactic kludge that commands bash to create a TCP socket, connect it to the destination specified, and then do the same thing as a usual file redirection does (namely, replace the respective standard stream with the socket using dup2(2)).



                  Hence, one can stream data from dd or tar at the source machine directly via TCP. Or, conversely, to stream data to tar or something alike directly via TCP. In any case, one superfluous netcat is eliminated.



                  Notes about netcat



                  There is an inconsistency in syntax between classical netcat and GNU netcat. I’ll use the classical syntax I’m accustomed to. Replace -lp with -l for GNU netcat.



                  Also, I’m unsure whether does GNU netcat accept -q switch.



                  Transferring a disk image



                  (Along the lines of zackse’s answer.)

                  On destination:



                  nc -lp 9999 >disk_image


                  On source:



                  dd if=/dev/sda >/dev/tcp/destination/9999
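
Since the question asks for basic error detection, it's worth noting that a checksum can be computed on the fly on both ends of such a pipe with tee, without a second pass over the disk. The snippet below is a local sketch of that pattern, not part of the original answer: in real use the byte stream would be dd if=/dev/sda on the source, and tee plus sha256sum would sit behind the TCP transfer on each machine.

```shell
#!/bin/sh
# Local sketch of on-the-fly checksumming with tee: the byte stream is
# hashed as it flows (tee duplicates it to a file and to sha256sum),
# then the written file is hashed again; matching digests mean no bytes
# were lost or corrupted in the pipe.
payload=$(mktemp)   # stand-in for /dev/sda
dd if=/dev/urandom of="$payload" bs=1024 count=64 2>/dev/null

received=$(mktemp)  # stand-in for the destination's disk_image
send_sum=$(dd if="$payload" bs=1024 2>/dev/null | tee "$received" | sha256sum | cut -d' ' -f1)
recv_sum=$(sha256sum < "$received" | cut -d' ' -f1)

[ "$send_sum" = "$recv_sum" ] && echo "checksums match"
```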
                   


                  Creating a tar.gz archive, with tar



                  On destination:



                  nc -lp 9999 >backup.tgz


                  On source:



                  tar cz files or directories to be transferred >/dev/tcp/destination/9999


                  Replace .tgz with .tbz and cz with cj to get a bzip2-compressed archive.



                  Transferring with immediate expansion to file system



                  Also with tar.

                  On destination:



cd backups
tar x </dev/tcp/source/9999


                  On source:



                  tar c files or directories to be transferred |nc -q 1 -lp 9999


It will work without -q 1, but netcat will get stuck when the data ends. See tar(1) for an explanation of the syntax and caveats of tar. If there are many files with high redundancy (low entropy), then compression (e.g. cz and xz instead of c and x) can be tried, but if the files are typical and the network is fast enough, it would only slow the process. See mikeserv’s answer for details about compression.



Alternative style (the destination listens on a port)



                  On destination:



                  cd backups
                  nc -lp 9999 |tar x


                  On source:



                  tar c files or directories to be transferred >/dev/tcp/destination/9999






edited Sep 9 '15 at 17:54 by A.L
answered Sep 9 '15 at 10:16 by Incnis Mrsi

• bash can't actually "listen" on a socket apparently, in order to wait and receive a file: unix.stackexchange.com/questions/49936/… so you'd have to use something else for at least one half of the connection...
  – rogerdpack, Sep 27 at 22:07










                  up vote
                  3
                  down vote













Try the suggestions regarding direct connections and avoiding encrypted protocols such as ssh. Then, if you still want to eke out every bit of performance, give this site a read: https://fasterdata.es.net/host-tuning/linux/ for advice on optimizing your TCP windows.
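
The host tuning that page covers is largely about raising TCP buffer limits so the window can grow on fast links. A sketch of what that looks like on Linux; the values below are illustrative, not a universal recommendation:

```shell
# Raise the maximum socket buffer sizes (run as root; illustrative values)
sysctl -w net.core.rmem_max=67108864
sysctl -w net.core.wmem_max=67108864
# min / default / max auto-tuning bounds for TCP read and write buffers
sysctl -w net.ipv4.tcp_rmem="4096 87380 33554432"
sysctl -w net.ipv4.tcp_wmem="4096 65536 33554432"
```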






answered Sep 7 '15 at 22:20 by Brandon Xavier
                          up vote
                          2
                          down vote













I would use this script I wrote, which needs the socat package.

On the source machine:

tarnet -d wherefilesaretosend pass=none 12345 .

On the target machine:

tarnet -d wherefilesaretogo pass=none sourceip/12345

If the vbuf package (Debian, Ubuntu) is installed, the file sender will show a progress indicator, and the file receiver will show which files are received. The pass= option can be used where the data might be exposed (slower).

Edit:

Use the -n option to disable compression, if CPU is a bottleneck.






edited Sep 7 '15 at 9:18
answered Sep 7 '15 at 9:11 by Skaperen
                                  up vote
                                  2
                                  down vote













If budget is not the main concern, you could try connecting the drives with an Intel Xeon E5 12-core "drive connector". This connector is usually so powerful that you can even run your current server software on it. From both servers!

This might look like a fun answer, but you should really consider why you are moving the data between servers, and whether one big server with shared memory and storage might make more sense.

Not sure about the current specs, but the slow transfer might be limited by the disk speeds, not the network.






answered Sep 8 '15 at 0:19 by user133111
                                          up vote
                                          1
                                          down vote













If you only care about backups, and not about a byte-for-byte copy of the hard drive, then I would recommend BackupPC. http://backuppc.sourceforge.net/faq/BackupPC.html It's a bit of a pain to set up, but it transfers very quickly.

My initial transfer time for about 500G of data was around 3 hours. Subsequent backups happen in about 20 seconds.

If you're not interested in backups, but are trying to sync things, then rsync or unison would better fit your needs.

A byte-for-byte copy of a hard disk is usually a horrid idea for backup purposes (no incrementals, no space saving, the drive can't be in use, you have to back up the "empty space", and you have to back up garbage like a 16G swap file or 200G of core dumps). Using rsync (or BackupPC or others) you can create "snapshots" in time, so you can go back to "what your file system looked like 30 minutes ago" with very little overhead.

That said, if you really want to transfer a byte-for-byte copy, then your problem is going to lie in the transfer and not in getting the data off the drive. Without 400G of RAM, a 320G transfer is going to take a very long time. Using protocols that are not encrypted is an option, but no matter what, you're just going to have to sit there and wait for several hours (over the network).






• how does 400G of RAM speed up data transfer?
  – Skaperen, Sep 7 '15 at 9:39

• Not sure this was the intent, but I read it as "any medium slower than RAM to RAM transfer is going to take a while", rather than "buy 400 GB of RAM and your HDD to HDD transfer will go faster".
  – MichaelS, Sep 7 '15 at 10:07

• Yep, RAM will buffer for you, and it will seem faster. You can do an HD-to-HD transfer with RAM buffering all the way and it will seem very fast. It will also take quite a while to flush to disk, but HD to RAM to RAM to HD is faster than HD to HD. (Keep in mind you have to do HD to RAM to RAM to HD anyway, but if you have less than your entire transfer size of RAM you will have to "flush" in segments.)
  – coteyr, Sep 7 '15 at 12:07

• Another way to put it is that to compress, or even just send, the entire source drive has to be read into RAM. If it doesn't fit all at once, it has to read a segment, send, discard the segment, seek, read a segment, etc. If it fits all at once then it just has to read it all at one time. Same on the destination.
  – coteyr, Sep 7 '15 at 12:10

• HD to RAM to RAM to HD is faster than HD to HD — how can it be quicker?
  – A.L, Sep 11 '15 at 17:36


answered Sep 7 '15 at 8:27 by coteyr






                                          • 1




                                            how does 400G of RAM speed up data transfer?
                                            – Skaperen
                                            Sep 7 '15 at 9:39










                                          • Not sure this was the intent, but I read it as "any medium slower than RAM to RAM transfer is going to take a while", rather than "buy 400 GB of RAM and your HDD to HDD transfer will go faster".
                                            – MichaelS
                                            Sep 7 '15 at 10:07










                                          • Yep,, ram will buffer for you, and it will seem faster. You can do a HD to HD transfer with RAM buffering all the way and it will seem very fast. It will also take quite a wile to flush to disk, but HD to RAM to RAM to HD is faster then HD to HD. (Keep in mind you have to do HD to RAM to RAM to HD anyway but if you have less then your entire transfer size of RAM you will have to "flush" in segments.)
                                            – coteyr
                                            Sep 7 '15 at 12:07










                                          • Another way to put is that to compress or even just send the entire source drive has to be read in to ram. If it doesn't fit all at once, it has to read a segment, send, discard segment, seek, read segment, etc. If it fits all at once then it just has to read all at one time. Same on the destination.
                                            – coteyr
                                            Sep 7 '15 at 12:10







                                          • 1




                                            HD to RAM to RAM to HD is faster then HD to HD How can it be quicker?
                                            – A.L
                                            Sep 11 '15 at 17:36












                                          • 1




                                            how does 400G of RAM speed up data transfer?
                                            – Skaperen
                                            Sep 7 '15 at 9:39










                                          • Not sure this was the intent, but I read it as "any medium slower than RAM to RAM transfer is going to take a while", rather than "buy 400 GB of RAM and your HDD to HDD transfer will go faster".
                                            – MichaelS
                                            Sep 7 '15 at 10:07










                                          • Yep,, ram will buffer for you, and it will seem faster. You can do a HD to HD transfer with RAM buffering all the way and it will seem very fast. It will also take quite a wile to flush to disk, but HD to RAM to RAM to HD is faster then HD to HD. (Keep in mind you have to do HD to RAM to RAM to HD anyway but if you have less then your entire transfer size of RAM you will have to "flush" in segments.)
                                            – coteyr
                                            Sep 7 '15 at 12:07










                                          • Another way to put is that to compress or even just send the entire source drive has to be read in to ram. If it doesn't fit all at once, it has to read a segment, send, discard segment, seek, read segment, etc. If it fits all at once then it just has to read all at one time. Same on the destination.
                                            – coteyr
                                            Sep 7 '15 at 12:10







                                          • 1




                                            HD to RAM to RAM to HD is faster then HD to HD How can it be quicker?
                                            – A.L
                                            Sep 11 '15 at 17:36







                                          1




                                          1




                                          how does 400G of RAM speed up data transfer?
                                          – Skaperen
                                          Sep 7 '15 at 9:39




                                          how does 400G of RAM speed up data transfer?
                                          – Skaperen
                                          Sep 7 '15 at 9:39












                                          Not sure this was the intent, but I read it as "any medium slower than RAM to RAM transfer is going to take a while", rather than "buy 400 GB of RAM and your HDD to HDD transfer will go faster".
                                          – MichaelS
                                          Sep 7 '15 at 10:07




                                          Not sure this was the intent, but I read it as "any medium slower than RAM to RAM transfer is going to take a while", rather than "buy 400 GB of RAM and your HDD to HDD transfer will go faster".
                                          – MichaelS
                                          Sep 7 '15 at 10:07












                                          Yep,, ram will buffer for you, and it will seem faster. You can do a HD to HD transfer with RAM buffering all the way and it will seem very fast. It will also take quite a wile to flush to disk, but HD to RAM to RAM to HD is faster then HD to HD. (Keep in mind you have to do HD to RAM to RAM to HD anyway but if you have less then your entire transfer size of RAM you will have to "flush" in segments.)
                                          – coteyr
                                          Sep 7 '15 at 12:07




                                          Yep,, ram will buffer for you, and it will seem faster. You can do a HD to HD transfer with RAM buffering all the way and it will seem very fast. It will also take quite a wile to flush to disk, but HD to RAM to RAM to HD is faster then HD to HD. (Keep in mind you have to do HD to RAM to RAM to HD anyway but if you have less then your entire transfer size of RAM you will have to "flush" in segments.)
                                          – coteyr
                                          Sep 7 '15 at 12:07












                                          Another way to put is that to compress or even just send the entire source drive has to be read in to ram. If it doesn't fit all at once, it has to read a segment, send, discard segment, seek, read segment, etc. If it fits all at once then it just has to read all at one time. Same on the destination.
                                          – coteyr
                                          Sep 7 '15 at 12:10





                                          Another way to put is that to compress or even just send the entire source drive has to be read in to ram. If it doesn't fit all at once, it has to read a segment, send, discard segment, seek, read segment, etc. If it fits all at once then it just has to read all at one time. Same on the destination.
                                          – coteyr
                                          Sep 7 '15 at 12:10





"HD to RAM to RAM to HD is faster than HD to HD" — how can it be quicker?
– A.L
Sep 11 '15 at 17:36














                                          up vote
                                          1
                                          down vote













Regardless of the program, I have usually found that "pulling" files over a network is faster than "pushing". That is, logging into the destination computer and doing a read is faster than logging into the source computer and doing a write.



Also, if you are going to use an intermediate drive, consider this: get an external drive (either as a packaged unit, or a separate drive plugged into a docking station) which uses eSATA rather than USB. Then on each of the two computers either install a card with an eSATA port, or get a simple adapter cable which brings one of the internal SATA ports out to an external eSATA connector. Plug the drive into the source computer, power it up, and wait for it to auto-mount (you could mount it manually, but if you are doing this repeatedly you might as well put it into your fstab file). Then copy; you will be writing at the same speed as to an internal drive. Then unmount the drive, power it down, plug it into the other computer, power it up, wait for the auto-mount, and read.
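The pull-vs-push distinction can be sketched with rsync; the hostnames foo and bar and the paths are hypothetical, and the runnable part at the end just demonstrates the same invocation shape locally:

```shell
# "Push": run on the source host (foo), writing to the destination (bar):
#   rsync -a /data/ user@bar:/backup/data/
# "Pull": run on the destination host (bar), reading from the source (foo):
#   rsync -a user@foo:/data/ /backup/data/
# Local demonstration of the same rsync invocation (no network involved):
src=$(mktemp -d); dst=$(mktemp -d)
echo "hello" > "$src/file.txt"
rsync -a "$src/" "$dst/"
cat "$dst/file.txt"
```

The `-a` (archive) flag preserves permissions, ownership where possible, and timestamps in both directions.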






answered Sep 7 '15 at 21:58 by Mike Ciaraldi

• Can you provide specifics of how you're "pulling" files? What utilities are you using, and can you provide any sample showing this effect?
– STW
Sep 10 '15 at 13:22

• I am not sure if this will be a more complete answer, but consider this scenario: suppose you have two computers, foo and bar, and you want to copy data from foo to bar. (1) You log into foo, then remote-mount the drive which is physically attached to bar. Then you copy from foo's disk onto the remotely mounted directory (which is physically on bar). I called this pushing the data to the other computer. (2) Compare this to the other way of copying the same data: log into bar, remote-mount the directory attached to foo, and read from foo onto bar's drive. This is pulling.
– Mike Ciaraldi
Sep 10 '15 at 17:42

• This copying can be done with the Linux cp command, from a GUI file manager, or any other way of copying files. I think pulling turns out to be faster because writing is slower than reading, and more of the decisions on how to write to the destination disk are made on the same computer the drive is attached to, so there is less overhead. But maybe this is no longer the case with more modern systems.
– Mike Ciaraldi
Sep 10 '15 at 17:45











                                          up vote
                                          1
                                          down vote













I'm going to recommend that you look at NIC teaming. This involves using multiple network connections running in parallel. Assuming that you really need more than 1Gbit/s of transfer, and that 10Gbit/s is cost-prohibitive, the 2Gbit/s provided by NIC teaming would be a minor cost, and your computers may already have the extra ports.
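A minimal sketch of what NIC teaming can look like on Linux, assuming systemd-networkd and two spare interfaces; the interface names enp3s0/enp4s0 and the address are placeholder assumptions, and 802.3ad mode additionally requires switch support for link aggregation:

```ini
# /etc/systemd/network/10-bond0.netdev -- define the bond device
[NetDev]
Name=bond0
Kind=bond

[Bond]
Mode=802.3ad
TransmitHashPolicy=layer3+4

# /etc/systemd/network/20-bond0-slaves.network -- enslave the physical NICs
[Match]
Name=enp3s0 enp4s0

[Network]
Bond=bond0

# /etc/systemd/network/30-bond0.network -- give the bond an address
[Match]
Name=bond0

[Network]
Address=192.168.1.50/24
```

Whether a single TCP stream actually exceeds 1Gbit/s over such a bond depends on the hash policy at both ends, as the comments below discuss.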






answered Sep 8 '15 at 19:18 by Byron Jones

• If you're referring to LACP (Link Aggregation Control Protocol) then you're not going to see an increase in speed. It provides redundancy and some ability to serve more concurrent connections, but it won't provide a speed boost for this type of transfer.
– STW
Sep 9 '15 at 21:04

• @STW: It requires switch support to aggregate two links to one machine into a 2Gbit link, but it is possible. Helpful only if both machines have a 2Gbit link to the switch, though. If you have two cables running NIC<->NIC, with no switch, that should work too, but isn't very useful (unless you have a 3rd NIC in one machine to keep them connected to the Internet).
– Peter Cordes
Sep 10 '15 at 6:03

• Is there a specific name for this feature in switches?
– STW
Sep 10 '15 at 13:21

• There are several variations of NIC teaming, EtherChannel, etc. STW is right that for certain configurations this won't help, but for some configurations it would. It comes down to whether or not the bonded channel speeds up performance for a single IP socket. You'll need to research the specifics to determine if this is a viable solution for you.
– Byron Jones
Sep 11 '15 at 20:18

• 802.3ad is the open standard that you'd look for on your switches. As a quick hack, though, you might just connect extra NICs to the network and give them appropriate IP addresses on separate subnets in private address space (host 1 port a and host 2 port a get one subnet; host 1 port b and host 2 port b get another subnet). Then just run two parallel jobs to do the transfer. This will be a lot simpler than learning the ins and outs of EtherChannel, 802.3ad, etc.
– Dan Pritts
Sep 14 '15 at 14:02











                                          • If you're referring to LACP (Link Aggregation Control Protocol) then you're not going to see an increase in speed. It Provided redundancy and some ability to serve more concurrent connections, but it won't provide a speed boost for this type of transfer.
                                            – STW
                                            Sep 9 '15 at 21:04










                                          • @STW: It requires switch support to aggregate two links to one machine into a 2gbit link, but it is possible. Helpful only if both machines have a 2gbit link to the switch, though. If you have two cables running NIC<->NIC, with no switch, that should work too, but isn't very useful (unless you have a 3rd NIC in one machine to keep them connected to the Internet).
                                            – Peter Cordes
                                            Sep 10 '15 at 6:03











                                          • is there a specific name for this feature in switches?
                                            – STW
                                            Sep 10 '15 at 13:21










                                          • There are several variations of NIC-teaming, EtherChannel, etc. STW is right for certain configurations, this won't help, but for some configurations, it would. It comes down to whether or not the bonded channel speeds up performance for a single IP socket or not. You'll need to research the specifics to determine if this is a viable solution for you.
                                            – Byron Jones
                                            Sep 11 '15 at 20:18










                                          • 802.3ad is the open standard that you'd look for on your switches. As a quick hack, though, you might just connect extra NICs to the network, and give them appropriate IP addresses on separate subnets in private address space. (host 1 port a & host 2 port a get one subnet, host 1 port b and host 2 port b get another subnet). Then just run two parallel jobs to do the transfer. This will be a lot simpler than learning the ins and outs of Etherchannel, 802.3ad, etc.
                                            – Dan Pritts
                                            Sep 14 '15 at 14:02
















                                          • If you're referring to LACP (Link Aggregation Control Protocol) then you're not going to see an increase in speed. It Provided redundancy and some ability to serve more concurrent connections, but it won't provide a speed boost for this type of transfer.
                                            – STW
                                            Sep 9 '15 at 21:04










                                          • @STW: It requires switch support to aggregate two links to one machine into a 2gbit link, but it is possible. Helpful only if both machines have a 2gbit link to the switch, though. If you have two cables running NIC<->NIC, with no switch, that should work too, but isn't very useful (unless you have a 3rd NIC in one machine to keep them connected to the Internet).
                                            – Peter Cordes
                                            Sep 10 '15 at 6:03











                                          • is there a specific name for this feature in switches?
                                            – STW
                                            Sep 10 '15 at 13:21










                                          • There are several variations of NIC-teaming, EtherChannel, etc. STW is right for certain configurations, this won't help, but for some configurations, it would. It comes down to whether or not the bonded channel speeds up performance for a single IP socket or not. You'll need to research the specifics to determine if this is a viable solution for you.
                                            – Byron Jones
                                            Sep 11 '15 at 20:18










                                          • 802.3ad is the open standard that you'd look for on your switches. As a quick hack, though, you might just connect extra NICs to the network, and give them appropriate IP addresses on separate subnets in private address space. (host 1 port a & host 2 port a get one subnet, host 1 port b and host 2 port b get another subnet). Then just run two parallel jobs to do the transfer. This will be a lot simpler than learning the ins and outs of Etherchannel, 802.3ad, etc.
                                            – Dan Pritts
                                            Sep 14 '15 at 14:02















                                          If you're referring to LACP (Link Aggregation Control Protocol) then you're not going to see an increase in speed. It Provided redundancy and some ability to serve more concurrent connections, but it won't provide a speed boost for this type of transfer.
                                          – STW
                                          Sep 9 '15 at 21:04




                                          If you're referring to LACP (Link Aggregation Control Protocol) then you're not going to see an increase in speed. It Provided redundancy and some ability to serve more concurrent connections, but it won't provide a speed boost for this type of transfer.
                                          – STW
                                          Sep 9 '15 at 21:04












                                          @STW: It requires switch support to aggregate two links to one machine into a 2gbit link, but it is possible. Helpful only if both machines have a 2gbit link to the switch, though. If you have two cables running NIC<->NIC, with no switch, that should work too, but isn't very useful (unless you have a 3rd NIC in one machine to keep them connected to the Internet).
                                          – Peter Cordes
                                          Sep 10 '15 at 6:03





                                          @STW: It requires switch support to aggregate two links to one machine into a 2gbit link, but it is possible. Helpful only if both machines have a 2gbit link to the switch, though. If you have two cables running NIC<->NIC, with no switch, that should work too, but isn't very useful (unless you have a 3rd NIC in one machine to keep them connected to the Internet).
                                          – Peter Cordes
                                          Sep 10 '15 at 6:03













                                          is there a specific name for this feature in switches?
                                          – STW
                                          Sep 10 '15 at 13:21




                                          is there a specific name for this feature in switches?
                                          – STW
                                          Sep 10 '15 at 13:21












                                          There are several variations of NIC-teaming, EtherChannel, etc. STW is right for certain configurations, this won't help, but for some configurations, it would. It comes down to whether or not the bonded channel speeds up performance for a single IP socket or not. You'll need to research the specifics to determine if this is a viable solution for you.
                                          – Byron Jones
                                          Sep 11 '15 at 20:18
















                                          802.3ad is the open standard that you'd look for on your switches. As a quick hack, though, you might just connect extra NICs to the network, and give them appropriate IP addresses on separate subnets in private address space. (host 1 port a & host 2 port a get one subnet, host 1 port b and host 2 port b get another subnet). Then just run two parallel jobs to do the transfer. This will be a lot simpler than learning the ins and outs of Etherchannel, 802.3ad, etc.
                                          – Dan Pritts
                                          Sep 14 '15 at 14:02
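The two-parallel-jobs idea can be sketched locally. This is a stand-in demo (file halves instead of two NIC pairs; all names and sizes are illustrative, not from the original answers): each half of the source is streamed by its own job, then the halves are recombined and verified.

```shell
# Local stand-in for the dual-link trick: split the source into two halves
# and copy them with two parallel jobs, then verify the reassembled copy.
# On real hardware, each job would target an address on a different subnet/NIC pair.
truncate -s 8M data.img                                  # stand-in source
dd if=data.img of=half1 bs=1M count=4 status=none &      # job 1: first half
dd if=data.img of=half2 bs=1M skip=4 status=none &       # job 2: second half
wait
cat half1 half2 > rebuilt.img                            # reassemble on the target
cmp data.img rebuilt.img && echo "halves match"
```

On a real two-link setup, each `dd` would pipe into its own network transport (ssh, nc, etc.) instead of a local file.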














                                          up vote
                                          1
                                          down vote













                                          FWIW, I've always used this:



                                          tar -cpf - <source path> | ssh user@destserver "cd /; tar xf -"


The nice thing about this method is that it maintains file/folder permissions between machines (assuming the same users/groups exist on both).
(I also typically use it to copy virtual disk images, since I can add the -S parameter to handle sparse files.)



Just tested this between two busy servers and managed ~14GB in 216s
(about 64MB/s) - it might do better between dedicated machines and/or with compression... YMMV



                                          $ date; tar -cpf - Installers | ssh elvis "cd /home/elvis/tst; tar xf -"; date
                                          Wed Sep 9 15:23:37 EDT 2015
                                          Wed Sep 9 15:27:13 EDT 2015

                                          $ du -s Installers
                                          14211072 Installers
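As a hedged local sketch of the `-S` behavior mentioned above (filenames are illustrative, not from the answer): GNU tar's `-S` detects holes when creating the archive and restores them on extraction, so a mostly-empty disk image archives and transfers quickly.

```shell
# Sketch of tar's -S (sparse) handling; filenames are illustrative.
truncate -s 100M sparse.img            # 100MB apparent size, ~0 blocks on disk
mkdir -p out
# Same pipe shape as the ssh version above, but local:
tar -cpSf - sparse.img | (cd out && tar -xpf -)
ls -l out/sparse.img                   # apparent size preserved
du -k out/sparse.img                   # actual disk usage stays tiny
```

Over the network, the subshell would simply be replaced by `ssh user@destserver "cd /dest; tar xpf -"` as in the answer.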





                                              edited Sep 9 '15 at 21:00









                                              muru

                                              33.9k578147














                                              answered Sep 9 '15 at 20:27









                                              ttstooge

                                              111
























                                                  up vote
                                                  1
                                                  down vote













Unless you want to do filesystem forensics, use a dump/restore program for your filesystem to avoid copying the free space that the FS isn't using. Depending on what filesystem you have, this will typically preserve all metadata, including ctime. Inode numbers may change, though, again depending on the filesystem (xfs, ext4, ufs...).



                                                  The restore target can be a file on the target system.



                                                  If you want a full-disk image with the partition table, you can dd the first 1M of the disk to get the partition table / bootloaders / stuff, but then xfsdump the partitions.



                                                  I can't tell from your info-dump what kind of filesystem you actually have. If it's BSD ufs, then I think that has a dump/restore program. If it's ZFS, well IDK, there might be something.



                                                  Generally full-copying disks around is too slow for anything except recovery situations. You can't do incremental backups that way, either.
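The first-1M trick above can be sketched with an image file standing in for the disk (all paths here are illustrative; on a real system the input would be /dev/sda, and the dump step is shown only as a comment since it needs a mounted filesystem):

```shell
# Stand-in for a real disk; on actual hardware the input would be /dev/sda.
truncate -s 16M disk.img
# Grab the first 1 MiB: partition table, bootloaders, and friends.
dd if=disk.img of=disk-head.img bs=1M count=1 status=none
# On a real system you would then dump only the used blocks of each partition,
# e.g. (not run here; requires a mounted XFS filesystem):
#   xfsdump -l 0 - /mountpoint | ssh backup 'cat > /backup/part1.xfsdump'
```

Restoring would reverse the order: write the saved head back first, then run the filesystem's restore tool into each recreated partition.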






                                                      answered Sep 9 '15 at 21:47









                                                      Peter Cordes

                                                      4,1121032
























                                                          up vote
                                                          1
                                                          down vote













You could also set up the systems to have shared storage!

I suggest this since the machines are next to each other, and you are likely to do this again and again....






                                                              answered Sep 10 '15 at 9:29









                                                              user133526

                                                              111
























                                                                  up vote
                                                                  1
                                                                  down vote













How about an ethernet crossover cable? Instead of relying on wireless speeds, you're capped only at the wired speed of your NIC.

Here's a similar question with some examples of that kind of solution.

Apparently just a typical ethernet cable will suffice nowadays. Obviously the better your NIC, the faster the transfer.

To summarize, if any network setup is necessary, it should be limited to simply setting static IPs for your server and backup computer with a subnet mask of 255.255.255.0.



                                                                  Good luck!



                                                                  Edit:



                                                                  @Khrystoph touched on this in his answer
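A minimal sketch of the static-IP setup described above, as a Debian-style /etc/network/interfaces fragment (the interface name eth0 and the 192.168.2.x addresses are assumptions; use .1 on one machine and .2 on the other, on a subnet your LAN isn't already using):

```
# /etc/network/interfaces fragment -- direct server<->backup link
# (eth0 and 192.168.2.x are illustrative)
auto eth0
iface eth0 inet static
    address 192.168.2.1        # use 192.168.2.2 on the backup machine
    netmask 255.255.255.0
```

With both ends configured, the transfer simply targets the peer's static address over the dedicated cable.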




























• How will it improve speed rates? Can you please explain it in your answer?
                                                                    – A.L
                                                                    Sep 11 '15 at 17:43







                                                                  • 1




                                                                    It would potentially improve speed because you would not have to worry about the intermediate network slowing you down. Regarding "typical" vs "crossover" ethernet cables - 1Gb ethernet will auto-crossover as necessary. HP ethernet switches will do this at 100Mb. Other brands, generally not, and you'll need a crossover if you're stuck at 100Mb.
                                                                    – Dan Pritts
                                                                    Sep 11 '15 at 20:47














                                                                  edited Sep 11 '15 at 19:56

























                                                                  answered Sep 8 '15 at 7:41







                                                                  user133156




























                                                                  up vote
                                                                  1
                                                                  down vote













                                                                  Several folks recommend that you skip ssh because encryption will slow you down. Modern CPUs may actually be fast enough at 1Gb, but OpenSSH has problems with its internal windowing implementation that can drastically slow you down.



                                                                  If you want to do this with ssh, take a look at HPN SSH. It solves the windowing problems and adds multithreaded encryption. Unfortunately you'll need to rebuild ssh on both the client & server.






                                                                      edited Sep 14 '15 at 13:55
                                                                      answered Sep 11 '15 at 20:54
                                                                      Dan Pritts

                                                                      43538




                                                                          up vote
                                                                          0
                                                                          down vote













OK, I have attempted to answer this question for two computers with "very large pipes" (10GbE) that are "close" to each other.

The problem you run into here is that most compression will bottleneck at the CPU, since the pipes are so large.

Performance transferring a 10GB file (6 Gbit network connection [Linode], incompressible data):



                                                                          $ time bbcp 10G root@$dest_ip:/dev/null
                                                                          0m16.5s

                                                                          iperf:

                                                                          server: $ iperf3 -s -F /dev/null
                                                                          client:
                                                                          $ time iperf3 -c $dest_ip -F 10G -t 20 # -t needs to be greater than time to transfer complete file
                                                                          0m13.44s
                                                                          (30% cpu)

                                                                          netcat (1.187 openbsd):

                                                                          server: $ nc -l 1234 > /dev/null
                                                                          client: $ time nc $dest_ip 1234 -q 0 < 10G
                                                                          0m13.311s
                                                                          (58% cpu)

                                                                          scp:

                                                                          $ time /usr/local/bin/scp 10G root@$dest_ip:/dev/null
                                                                          1m31.616s
                                                                          scp with hpn ssh patch (scp -- hpn patch on client only, so not a good test possibly):
                                                                          1m32.707s

                                                                          socat:

                                                                          server:
                                                                          $ socat -u TCP-LISTEN:9876,reuseaddr OPEN:/dev/null,creat,trunc
                                                                          client:
                                                                          $ time socat -u FILE:10G TCP:$dest_ip:9876
                                                                          0m15.989s


And two boxes on 10GbE, slightly older versions of netcat (CentOS 6.7), 10GB file:

nc: 0m18.706s (100% cpu, v1.84, no -q option)
iperf3: 0m10.013s (100% cpu, but can go up to at least 20GbE with 100% cpu, so not sure it matters)
socat: 0m10.293s (88% cpu, possibly maxed out)


So in one instance netcat used less cpu, in the other socat, so YMMV.



With netcat, if it doesn't have the "-N" or "-q 0" options it can transfer truncated files, so be careful; other options like "-w 10" can also result in truncated files.
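Given the truncation risk, it's worth checksumming both ends regardless of transport. A minimal local sketch of the idea (the /tmp scratch paths are placeholders; over the network you'd run sha256sum on each host and compare the digests):

```shell
# Create a scratch source file and "transfer" it (cp stands in for nc/socat here):
dd if=/dev/urandom of=/tmp/src.bin bs=1M count=4 2>/dev/null
cp /tmp/src.bin /tmp/dst.bin

# The two digests must match; a mismatch means the copy was truncated or corrupted:
sha256sum /tmp/src.bin /tmp/dst.bin
```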



                                                                          What is happening in almost all of these cases is the cpu is being maxed out, not the network. scp maxes out at about 230 MB/s, pegging one core at 100% utilization.



Iperf3 unfortunately creates corrupted files. Some versions of netcat seem to not transfer the entire file, which is very weird, especially the older versions.



Various incantations of "gzip as a pipe to netcat" or "mbuffer" also maxed out the CPU in gzip or mbuffer, so they didn't result in faster transfers with such large pipes. lz4 might help. In addition, some of the gzip pipe setups I attempted resulted in corrupted transfers for very large (> 4 GB) files, so be careful out there :)
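Before trusting a fast compressor in the pipe with real data, a cheap sanity check is a local round-trip with a byte count. A sketch using gzip -1 (substitute lz4 if you have it installed); the byte count must come back unchanged:

```shell
# 8 MiB in, compress, decompress, count bytes out; a lossless round-trip
# must print 8388608 (8 * 1048576):
dd if=/dev/zero bs=1M count=8 2>/dev/null | gzip -1 | gunzip | wc -c
```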



Another thing that might help, especially at higher latency, is tuning TCP settings. Here are guides that mention suggested values:



                                                                          http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm and https://fasterdata.es.net/host-tuning/linux/ (from another answer)
                                                                          possibly IRQ settings: https://fasterdata.es.net/host-tuning/100g-tuning/



Suggestions from Linode; add to /etc/sysctl.conf:



                                                                          net.core.rmem_max = 268435456 
                                                                          net.core.wmem_max = 268435456
                                                                          net.ipv4.tcp_rmem = 4096 87380 134217728
                                                                          net.ipv4.tcp_wmem = 4096 65536 134217728
                                                                          net.core.netdev_max_backlog = 250000
                                                                          net.ipv4.tcp_no_metrics_save = 1
                                                                          net.core.default_qdisc = fq


                                                                          Additionally, they would like you to run:



                                                                           /sbin/ifconfig eth0 txqueuelen 10000 


It's worth double-checking after tweaking to make sure the changes don't cause harm, too.
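One way to double-check is to read the live values back out of /proc after applying (the paths below are the standard Linux locations for the settings above):

```shell
# Read the kernel's current values and compare against what you set:
cat /proc/sys/net/core/rmem_max
cat /proc/sys/net/core/wmem_max
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem
```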



                                                                          Also may be worth tuning the window size: https://iperf.fr/iperf-doc.php#tuningtcp



With slow(er) connections, compression can definitely help, though. If you have big pipes, very fast compression might help with readily compressible data; I haven't tried it.



The standard answer for "syncing hard drives" is to rsync the files; that avoids transferring data where possible.






                                                                          share|improve this answer


























                                                                              edited Oct 2 at 15:55
                                                                              answered Sep 27 at 22:40
                                                                              rogerdpack

                                                                              2711312



