NFS caches expiring unexpectedly

I need to build an NFSv4 + cachefilesd setup on a high-latency, low-throughput link where local caches never expire. The only invalidation mechanism should be the NFS server's callbacks when something is updated (which works fine, by the way: changes to files on the server are passed on to the client instantly). The mount is read-only, so no locks are involved.



The issue: even though the client always correctly reads the requested file from the local cache, it keeps fetching the file's attributes if the file hasn't been accessed in the last 60 seconds or so, regardless of actimeo=86400 being set. It seems to depend on how often the file is opened, since it works perfectly as long as I keep opening it every 50 seconds or less.



Proof of concept:



(Server network latency is artificially set to 2000ms so I can clearly pinpoint when attribute checking is being performed)



Waiting 50 seconds after each request yields a 100% cache hit rate, as intended. This continues indefinitely:



root@client:~# while : ; do /usr/bin/time -f%e cat /nfs-mount/2bytes-file > /dev/null ; sleep 50 ; done
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00
0.00


With the delay between requests set to 70 seconds, the outcome becomes inconsistent:



root@client:~# while : ; do /usr/bin/time -f%e cat /nfs-mount/2bytes-file > /dev/null ; sleep 70 ; done
0.00
0.00
0.00
0.00
0.00
0.00
6.00 # <- Attributes fetched. Debug log recorded "NFS: nfs_update_inode(0:69/68697599 fh_crc=0xb9b7a69e ct=2 info=0x26040)"
4.00 # <- Attributes fetched. Debug log recorded "NFS: nfs_update_inode(0:69/68697599 fh_crc=0xb9b7a69e ct=2 info=0x26040)"
0.00
0.00
0.00
6.00 # <- Attributes fetched. Debug log recorded "NFS: nfs_update_inode(0:69/68697599 fh_crc=0xb9b7a69e ct=2 info=0x26040)"
4.00 # <- Attributes fetched. Debug log recorded "NFS: nfs_update_inode(0:69/68697599 fh_crc=0xb9b7a69e ct=2 info=0x26040)"
0.00
0.00
0.00
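
To make runs like the one above easier to compare, each timing can be labelled automatically. This is only a rough sketch: the `classify` and `probe` helper names and the 1.00 second threshold are assumptions chosen for illustration (any server round trip costs at least the 2000 ms artificial delay, so a local cache hit stays well below the threshold).

```shell
#!/bin/sh
# classify ELAPSED [THRESHOLD]: label one timing taken from the loop.
# Assumption: a read taking THRESHOLD seconds or more (default 1.00)
# involved at least one round trip to the server.
classify() {
    awk -v t="$1" -v thr="${2:-1.00}" \
        'BEGIN { if (t + 0 >= thr + 0) print "revalidated"; else print "cache-hit" }'
}

# probe FILE: time one read of FILE and print "HH:MM:SS elapsed label".
# /usr/bin/time writes its result to stderr, so stderr is captured here.
probe() {
    elapsed=$( { /usr/bin/time -f%e cat "$1" > /dev/null; } 2>&1 )
    printf '%s %s %s\n' "$(date +%T)" "$elapsed" "$(classify "$elapsed")"
}
```

Replacing the body of the loop with `probe /nfs-mount/2bytes-file` would then print a wall-clock time next to each result, which helps line the slow reads up with the debug log.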


Also, nfsstat shows an extra "getattr" each time one of those delays occurs:



create      delegpurge  delegreturn  getattr     getfh       link
2 0%        0 0%        82 0%        177063 10%  87644 5%    0 0%
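
Rather than reading the counter by eye, the getattr count can be pulled out of such a header/value pair and compared before and after a read. A minimal sketch, assuming the two-line layout shown above (one line of operation names, then one line of "count percentage" pairs); `nfs_op_count` is a hypothetical helper name, and it only handles a single header/value pair, not the full multi-block nfsstat output.

```shell
#!/bin/sh
# nfs_op_count OP: read one header line and one value line from stdin
# and print the call count for operation OP.
# Assumption: each operation has two tokens on the value line
# (count, then percentage), in the same order as the header.
nfs_op_count() {
    awk -v op="$1" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == op) col = i }
        NR == 2 && col { print $(2 * col - 1) }
    '
}
```

Taking the count once before the `cat` and once after it, and subtracting, shows whether that particular read triggered a GETATTR.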


And finally, when the delay is set to 110 seconds or more, every single request ends up being checked against the server for some reason:



root@client:~# while : ; do /usr/bin/time -f%e cat /nfs-mount/2bytes-file > /dev/null ; sleep 110 ; done 
6.00
6.00
6.00
6.00
6.00
6.00
6.00


I reproduced the very same behavior by serving this 2-byte file via HTTP with nginx instead of using "cat", and with "ioping" as well.



cachefilesd is not purging anything on its own, since there is more than enough space on its partition:



/dev/vdb 20G 3,0G 16G 17% /disk2/fscache


I know it is only fetching the file's metadata and not the content itself, because when I run the same test against a 2GB file (larger than the client's physical memory), it hangs for 2 seconds (the artificial network delay) and then starts reading the copy cached locally by cachefilesd from disk, as expected.



I really don't understand what happens during those 1-2 minutes that causes the client to recheck with the server for updates, and it defeats the purpose of my setup.



/etc/exports:



/cache 192.168.122.234(ro,async,no_subtree_check)


Client mount:



root@client:~# mount -t nfs4 -o lookupcache=all,actimeo=86400,nocto,ro,intr,soft,proto=tcp,async,fsc 192.168.122.1:/cache /nfs-mount
root@client:~# cat /proc/mounts | grep nfs
192.168.122.1:/cache /nfs-mount nfs4 ro,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,acregmin=86400,acregmax=86400,acdirmin=86400,acdirmax=86400,soft,nocto,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.122.234,fsc,local_lock=none,addr=192.168.122.1 0 0
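
To double-check that the attribute cache timeouts really took effect, the options can be extracted from the mount table instead of being read off the long line above. A small sketch; `mount_opt` is a hypothetical helper name, and it assumes the usual /proc/mounts layout (device, mount point, type, comma-separated options).

```shell
#!/bin/sh
# mount_opt NAME: read /proc/mounts-style lines from stdin and print the
# value of option NAME (for example acregmax) for every nfs4 entry.
mount_opt() {
    awk -v name="$1" '$3 == "nfs4" {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            split(opts[i], kv, "=")
            if (kv[1] == name) print kv[2]
        }
    }'
}
```

For example, `mount_opt acregmax < /proc/mounts` should print 86400 for the mount shown above.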


The server is CentOS 7 and the client is Ubuntu 16.04. Packages come from the distros' repositories.



root@client:~# dpkg -l | grep nfs
ii libnfsidmap2:amd64 0.25-5 amd64 NFS idmapping library
ii nfs-common 1:1.2.8-9ubuntu12 amd64 NFS support files common to client and server
ii nfs4-acl-tools 0.3.3-3 amd64 Commandline and GUI ACL utilities for the NFSv4 client



I also tried using Ubuntu as the server and CentOS 7 as the client, to no avail.










Tags: nfs, cache, timeout

asked Sep 3 '16 at 11:29 by G.Ashburn
edited Mar 9 at 14:10 by Rui F Ribeiro
