Is it possible in Linux to disable filesystem caching for specific files?

I have some large files and I am OK with them being read at disk I/O capacity. I want to keep the file-system cache free for other files. Is it possible to turn off file-system caching for specific files on Linux?
I want to do this programmatically via a native library plus Java.

edited Jun 20 at 23:50 by ctrl-alt-delor
asked Jun 20 at 23:06 by Urvishsinh Mahida

  • You can definitely do it per file open: every time you open a file, you can choose not to cache.
    – ctrl-alt-delor
    Jun 20 at 23:51

2 Answers

You're looking for the Java equivalent of the O_DIRECT flag for open(2). See http://man7.org/linux/man-pages/man2/open.2.html

answered Jun 20 at 23:14 by roaima

  • Implemented in OpenJDK since 2017
    – ajeh
    Jun 21 at 20:09

  • @ajeh That looks to be only OpenJDK 9, and I presume 10, per bugs.openjdk.java.net/browse/JDK-8164900
    – Andrew Henle
    Jun 22 at 14:43

You can do so for each open instance of the file, but not persistently for the file itself. To bypass the cache for a given open file, use direct I/O. I'm not sure how to do this in Java, but in C and C++ you pass the O_DIRECT flag to the open() call.

Note, however, that this has a couple of potentially problematic implications:

  • It's downright dangerous on certain filesystems. Most notably, current versions of BTRFS have serious issues with direct I/O when you're writing to the file.

  • You can't mix direct I/O with regular cached I/O unless you use some form of synchronization. Cached writes are not guaranteed to be visible to direct I/O reads until you call fsync() or fdatasync(), and direct I/O writes may never become visible to cached I/O reads.
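As a rough illustration of the direct I/O approach, here is a minimal Python sketch (Linux only). The names read_direct and chunk_size are hypothetical, and the sketch assumes that the filesystem accepts O_DIRECT and that 1 MiB is a multiple of the storage's logical block size. Python's os.O_DIRECT is the same flag that C passes to open().

```python
import mmap
import os


def read_direct(path, chunk_size=1 << 20):
    """Read `path` with O_DIRECT and return the number of bytes read.

    Assumption: chunk_size is a multiple of the logical block size of the
    underlying storage (1 MiB covers the common 512 and 4096 byte sizes).
    """
    # O_DIRECT asks the kernel to bypass the page cache for this descriptor.
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    # An anonymous mmap is page-aligned, which satisfies the buffer
    # alignment requirement of direct I/O.
    buf = mmap.mmap(-1, chunk_size)
    total = 0
    try:
        size = os.fstat(fd).st_size
        while total < size:
            # readv fills the aligned buffer in place; the file offset
            # advances by full chunks until the final, possibly short, read.
            n = os.readv(fd, [buf])
            if n == 0:
                break
            # Process buf[:n] here.
            total += n
    finally:
        buf.close()
        os.close(fd)
    return total
```

If os.open fails with EINVAL, the most likely cause is that the filesystem does not support O_DIRECT, so callers may want to catch OSError and fall back to a regular cached read.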

There is, however, an alternative if you can tolerate having the data in the cache temporarily. You can use the POSIX fadvise interface (the posix_fadvise system call on Linux) to tell the kernel that you no longer need data from the file once you're done reading it. With the POSIX_FADV_DONTNEED flag, you tell the kernel to drop a specific region of a particular file from the cache. You can even do this while processing the file, by reading a chunk and then immediately calling posix_fadvise on that region, though the regions you pass have to be aligned to the system's page size. This is generally the preferred portable method, as it works on any POSIX-compliant system with the real-time extensions (which is pretty much any POSIX-compliant system).
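The read-then-advise loop described above might look like the following Python sketch. The function name read_and_drop_cache and the 1 MiB chunk size are hypothetical choices; os.posix_fadvise and os.POSIX_FADV_DONTNEED are thin wrappers over the C interface of the same name. The advice is only a hint, so the kernel is free to keep some pages anyway.

```python
import mmap
import os


def read_and_drop_cache(path, chunk_size=1 << 20):
    """Read `path` sequentially and advise the kernel to drop each chunk
    from the page cache after it has been processed.

    Returns the number of bytes read.
    """
    # Chunks are a multiple of the page size so that every advised region
    # starts on a page boundary.
    assert chunk_size % mmap.PAGESIZE == 0
    fd = os.open(path, os.O_RDONLY)
    offset = 0
    try:
        while True:
            data = os.pread(fd, chunk_size, offset)
            if not data:
                break
            # Process `data` here.
            # Tell the kernel the pages behind this chunk are no longer needed.
            os.posix_fadvise(fd, offset, chunk_size, os.POSIX_FADV_DONTNEED)
            # For regular files a short read normally happens only at end of
            # file, so offset stays a multiple of chunk_size.
            offset += len(data)
    finally:
        os.close(fd)
    return offset
```

Unlike direct I/O, this works with ordinary buffers and places no alignment constraint on the reads themselves, only on the regions handed to posix_fadvise.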

answered Jun 21 at 19:37 by Austin Hemmelgarn

  • +1 very nice detail
    – roaima
    Jun 21 at 20:12

  • From the Linux open() man page: "Under Linux 2.4, transfer sizes, and the alignment of the user buffer and the file offset must all be multiples of the logical block size of the filesystem. Since Linux 2.6.0, alignment to the logical block size of the underlying storage (typically 512 bytes) suffices."
    – Andrew Henle
    Jun 22 at 14:47

  • Also from that man page: "The thing that has always disturbed me about O_DIRECT is that the whole interface is just stupid, and was probably designed by a deranged monkey on some serious mind-controlling substances." (Linus) Well, on Linus Torvalds's OS, direct I/O does act deranged. On Irix and Solaris, though, it works just fine. So, if you do use direct I/O on Linux, test thoroughly on your entire system, and if you change anything, test everything again.
    – Andrew Henle
    Jun 22 at 14:50

  • Note also that on Linux the underlying filesystem may or may not support direct I/O, and even if it officially does, that support may be sketchy at best (per your comment on BTRFS...). For example, only full page- or block-size I/O operations may be permitted, making it impossible to read or write the final smaller-than-a-full-page bit of data in a file.
    – Andrew Henle
    Jun 22 at 14:55
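
The alignment rule quoted from the man page amounts to rounding a requested byte range outward to block boundaries. A small Python sketch of that arithmetic follows; aligned_span is a hypothetical helper, and the default of 512 is only the "typical" logical block size mentioned in the quote, not a value queried from the device.

```python
def aligned_span(offset, length, block_size=512):
    """Widen the byte range [offset, offset + length) to block boundaries.

    Returns (aligned_offset, aligned_length), both multiples of block_size.
    """
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a positive power of two")
    mask = block_size - 1
    start = offset & ~mask                  # round the start down
    end = (offset + length + mask) & ~mask  # round the end up
    return start, end - start
```

For instance, a request for 100 bytes at offset 1000 becomes a 1024-byte transfer starting at offset 512. The caller would read the widened range into an aligned buffer and then slice out the bytes it actually wanted.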