How do I make ppp reliable over lossy radio modems using pppd and tcp kernel settings on debian?

I am having a lot of trouble establishing a reliable PPP / TCP/IP link between two Debian systems over radio modems.



My hardware topology is a little complex.



The system uses:



  • A Raspberry Pi 3B at each end of the radio link, running Raspbian
    Stretch (RPI)

  • RFDesign RFD900x radio modems (connected to the RPIs via FTDI
    USB cables) (RFD900)

  • A Linksys wifi router that is NATing (WIFI) to a satellite service
    (SkyMuster - Australia), to an unknown POP in AUS, to the Internet (SAT)

  • A VPN (vpnc) over the SAT to another Aus ISP static IP terminated
    by a router (which is the default route for the RPi 3Bs)

  • (VPN) The VPN endpoint is connected to the net with a
    static IP (END)

I believe the problem to be over the RFD900x modems, related to the TCP congestion backoff that occurs when the radio drops packets, though I provide the other details for context and in case I'm missing something silly.



The issues are reproducible between the RPIs over the RFD900.



From the end-point (with the most trouble) the link to the Internet is as follows:



RPI -> RFD900 -> RFD900 -> RPI -> VPN -> WIFI -> SAT -> END.



Again, the above is just for context.



The RFD900s drop packets a lot with the distance and obstacles involved. I have tried all sorts of aerial configurations to no avail (omni, Yagi, direct vs bouncing off granite cliffs). I have tried tuning all sorts of parameters in the modems, MTU, ppp settings etc. to achieve TCP/IP reliability, to no avail.



Air speed is 64 kbit/s. Serial speed is 57.6 kbit/s.



Diag notes:



  • On simple serial-to-serial comms over the RFD900 at various distances,
    a radio MTU of 131 or 151 bytes gives the best throughput.

  • The throughput is reliable, though "bursty" -
    burst, burst, burst, rather than a continuous flow.

  • I suspect this "burstiness" is a function of TCP seeing the radio packet dropouts as congestion which progresses to an inevitable retry saturation.

  • When it saturates, sessions (ssh, scp, apt etc) just seem to freeze for
    variable amounts of extended time (seconds, often 2-3 minutes,
    sometimes > 10 minutes).

  • apt will generally fail. scp and ssh tend to keep on going and get there eventually, though usually with multiple stalls and crazy delay times.

  • Interactively over ssh, the link is usable, providing that no long responses are involved - eg a long ls -la.

  • Flow control to the modems (none, RTS/CTS, XON/XOFF) seems inconsequential in my tests (see the stty sketch after this list).

  • Different forms of ppp payload compression seem inconsequential (BSD, Predictor, deflate etc).

  • Van Jacobson header compression increases the throughput per burst, but exacerbates the stalls and delays.

  • I've searched extensively for solutions (even going back and reading the RFCs).

  • It seems VJ header compression was identified as problematic for lossy links, and there have been RFC advances on compression techniques, e.g. ROHC (RObust Header Compression), including a ROHC working group from which various proprietary compression protocols seem to have emerged that are not available in open source.

  • The problem seems well-solved for cellular links (both with ppp and RLP) - which rely on proprietary protocols.
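
For reference, switching the FTDI tty between the three flow-control modes with stty looks roughly like this (device and speed as in the pppd script below; shown purely for illustration):

stty -F /dev/ttyUSB0 57600 raw crtscts               # hardware RTS/CTS
stty -F /dev/ttyUSB0 57600 raw -crtscts ixon ixoff   # software XON/XOFF
stty -F /dev/ttyUSB0 57600 raw -crtscts -ixon -ixoff # no flow control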

I also post here my current script, which runs pppd (the various options I've tried are in the #commented lines):



# set up the pppd to allow the remote unit to connect as an IP peer
# additional options for remote end: usepeerdns defaultroute replacedefaultroute

pppd /dev/ttyUSB0 57600 mru 131 mtu 131 noauth crtscts nocdtrcts lock passive 192.168.10.1:192.168.10.2 local maxfail 0 persist proxyarp updetach

#pppd /dev/ttyUSB0 57600 novj novjccomp mru 131 mtu 131 noauth crtscts nocdtrcts lock passive 192.168.10.1:192.168.10.2 local maxfail 0 persist proxyarp updetach

#pppd /dev/ttyUSB0 57600 192.168.10.1:192.168.10.2 mru 131 mtu 131 proxyarp noauth crtscts nocdtrcts noaccomp nobsdcomp nodeflate nopcomp nopredictor1 novj novjccomp lock mru 131 mtu 131 passive local maxfail 0 persist updetach

#debug ktune bsdcomp 12 xonxoff nocrtscts mru 296 mtu 296
#pppd /dev/ttyUSB0 57600 debug mru 131 mtu 131 noauth crtscts nocdtrcts lock passive 192.168.10.1:192.168.10.2 local maxfail 0 persist updetach proxyarp

#pppd /dev/ttyUSB0 57600 noaccomp nobsdcomp nodeflate nopcomp nopredictor1 novj novjccomp mru 131 mtu 131 noauth crtscts nocdtrcts lock passive 192.168.10.1:192.168.10.2 local maxfail 0 persist proxyarp updetach

#pppd /dev/ttyUSB0 57600 novjccomp mru 131 mtu 131 noauth crtscts nocdtrcts lock passive 192.168.10.1:192.168.10.2 local maxfail 0 persist proxyarp updetach


Has anyone solved this with open source pppd? Are there other options or technologies which would be an alternative?



Are kernel TCP congestion settings worth looking into?
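
For reference, the congestion-control algorithms the running kernel offers and currently uses can be inspected with plain sysctl (nothing beyond a stock Debian kernel assumed):

sysctl net.ipv4.tcp_available_congestion_control
sysctl net.ipv4.tcp_congestion_control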







  • You seem to have tested ppp options thoroughly. The "saturation" behavior suggests bufferbloat (google), so have a look at tc and Linux Traffic Control tutorials.
    – dirkt
    May 9 at 10:21










    Were you looking over my shoulder :) Yes deep in that now. I've had some success with tcp_bbr congestion control and qdisc set to fair queuing. Will report back once I've done some more testing.
    – BrendanMcL
    May 9 at 11:23














1 Answer
As a partial answer to my own question, I have made significant reliability improvements through the following:



Changing the tcp_congestion_control kernel plugin from the default cubic to bbr has significantly addressed the bufferbloat that, along with the lossy radio connection, is at the core of the issue.



This required loading the tcp_bbr kernel module and also changing the net.core default queuing discipline to fair queuing (fq), which provides the pacing the bbr module relies on.



On the RPIs the defaults are:



net.ipv4.tcp_congestion_control=cubic



net.core.default_qdisc=pfifo_fast



The commands to change this at run-time are:



modprobe tcp_bbr
sysctl -w net.core.default_qdisc=fq
sysctl -w net.ipv4.tcp_congestion_control=bbr
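
To confirm the settings have taken effect, the values can simply be read back (plain sysctl, no extra packages assumed):

sysctl net.core.default_qdisc
sysctl net.ipv4.tcp_congestion_control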


I presently run these from a script called from /etc/rc.local. They can easily be made permanent via /etc/modules (or a modules-load.d snippet) and sysctl.conf, or as a file under sysctl.d - sketched below.
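
A minimal sketch of the persistent form (the file names here are illustrative, not prescribed):

# /etc/modules-load.d/tcp_bbr.conf - load the module at boot
tcp_bbr

# /etc/sysctl.d/99-bbr.conf - apply the sysctls at boot
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr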



The result is a far smoother response over ssh and far more reliable bulk transfer which, while still stalling, manages to recover quickly and complete reliably, returning to the command prompt immediately (rather than pausing at 100% for extended periods - e.g. 1-3 minutes - before returning, as was the case with the cubic congestion control).



The trade-off is overall speed; however, reliability is more important. For example, transferring a 283 kB file across the radio link with scp now results in:



100% 283KB 2.5KB/s 01:51


This is a compromise I am happy enough with for now.
However, long-running bulk transfers are still problematic: they eventually stall and never complete.



For example, apt-get update, running for over an hour (stalling on the 11.7 MB file), requires occasional carriage returns through the ssh terminal to keep running, and eventually bogs down to a very extended latency, though it does not fail altogether.



In the following screen scrape, the process ran for over an hour, with a few CRs every 10-15 minutes and a delay of approximately 5 minutes between the ^C being sent and the terminal responding:



root@priotdev2:~# apt-get update
Get:1 http://mirror.internode.on.net/pub/raspbian/raspbian stretch InRelease [15.0 kB]
Get:3 http://mirror.internode.on.net/pub/raspbian/raspbian stretch/main armhf Packages [11.7 MB]
Get:2 http://archive.raspberrypi.org/debian stretch InRelease [25.3 kB]
Get:2 http://archive.raspberrypi.org/debian stretch InRelease [25.3 kB]
Get:4 http://archive.raspberrypi.org/debian stretch/main armhf Packages [145 kB]
Get:5 http://archive.raspberrypi.org/debian stretch/ui armhf Packages [30.7 kB]
Get:3 http://mirror.internode.on.net/pub/raspbian/raspbian stretch/main armhf Packages [11.7 MB]
23% [3 Packages 270 kB/11.7 MB 2%] 2,864 B/s 1h 6min 15s
Get:3 http://mirror.internode.on.net/pub/raspbian/raspbian stretch/main armhf Packages [11.7 MB]
29% [3 Packages 1,263 kB/11.7 MB 11%] 131 B/s 22h 2min 19s
Get:3 http://mirror.internode.on.net/pub/raspbian/raspbian stretch/main armhf Packages [11.7 MB]
Get:3 http://mirror.internode.on.net/pub/raspbian/raspbian stretch/main armhf Packages [11.7 MB]
Get:3 http://mirror.internode.on.net/pub/raspbian/raspbian stretch/main armhf Packages [11.7 MB]
Get:3 http://mirror.internode.on.net/pub/raspbian/raspbian stretch/main armhf Packages [11.7 MB]
60% [3 Packages 5,883 kB/11.7 MB 50%] 16 B/s 4d 4h 13min 58s
60% [3 Packages 5,902 kB/11.7 MB 51%] 1,531 B/s 1h 2min 38s
66% [3 Packages 6,704 kB/11.7 MB 58%]

66% [3 Packages 6,704 kB/11.7 MB 58%]
Get:3 http://mirror.internode.on.net/pub/raspbian/raspbian stretch/main armhf Packages [11.7 MB]
66% [3 Packages 6,735 kB/11.7 MB 58%]
66% [3 Packages 6,735 kB/11.7 MB 58%]
66% [3 Packages 6,745 kB/11.7 MB 58%] 32 B/s 1d 18h 37min 55s
66% [3 Packages 6,745 kB/11.7 MB 58%] 32 B/s 1d 18h 37min 55s
66% [3 Packages 6,745 kB/11.7 MB 58%]
66% [3 Packages 6,745 kB/11.7 MB 58%]
66% [3 Packages 6,746 kB/11.7 MB 58%] 230 B/s 5h 55min 46s

66% [3 Packages 6,747 kB/11.7 MB 58%] 148 B/s 9h 12min 47s^C
root@priotdev2:~# ^C
root@priotdev2:~#
root@priotdev2:~#


Scrolling to the right in the above dump, the abysmal throughput can be seen (down to 16 and 32 bytes per second in some cases).



To remove the end-to-end variables that would otherwise involve the satellite link, apt is actually using the upstream RPI as an apt cache (which is up to date), so the transfers represent traffic over the radio link only.
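
For context, pointing apt at the upstream Pi looks roughly like the following (the proxy address, port and the use of apt-cacher-ng are illustrative assumptions, not a statement of the exact setup):

// /etc/apt/apt.conf.d/01proxy - assumes apt-cacher-ng on the upstream RPI
Acquire::http::Proxy "http://192.168.10.1:3142";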



Any insights from the community on further improvements would be most welcome.





