squashfs double caching problem
Let's say there's a large squashfs image stored in a file, which is mounted as a loopback device. As I understand it, kernels from 4.4 onwards have eliminated double caching on loopback devices, but unfortunately not for squashfs.
When you read files from the mounted squashfs, the compressed blocks from the image are read and cached by Linux. The decompressed data you access is also cached, so reading it again is very fast and doesn't require decompressing it again.
The second cache is very good, since it provides fast access. The first cache is pretty much redundant and completely useless: it pollutes RAM and evicts other cached entries (or forces applications to swap) that are actually useful. This is basically the double caching problem.
As long as the files are already cached, decompressed, there's no point keeping a cached version of the compressed data.
If the kernel really had to drop these caches later, it would drop the compressed data cache first (less recently used), and then the decompressed data. You would only notice that on a later read, because the kernel would have to re-read from the drive and decompress again. But keeping the compressed on-disk data cached in RAM is pointless!
So, to summarize:
- Keep the decompressed data cached so it's accessed very fast
- Don't keep the "compressed" squashfs data (on the file) at all
I tried mounting it with the sync option, but it doesn't do anything.
There is a workaround, though it's more kludge than solution. The following command drops the cache of the compressed squashfs data (good) without dropping the decompressed data (also good):
dd if=root.squashfs iflag=nocache count=0
I don't run it against the mountpoint, as that would drop the decompressed data (which would be bad). Instead I run it against the loop device's underlying file, since that's what I don't want cached (as it's pointless).
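You can watch this happen (assuming fincore(1) from util-linux is installed; a throwaway file stands in for root.squashfs so the sketch runs anywhere) by comparing the file's resident page count before and after the nocache read:

```shell
# Demonstration with a throwaway file standing in for root.squashfs.
# fincore(1) reports how many of the file's pages are resident in the
# page cache. The drop is advisory, so counts are "likely", not exact.
command -v fincore >/dev/null || { echo "fincore not installed"; exit 0; }

IMG=$(mktemp)                                  # stand-in for root.squashfs
dd if=/dev/zero of="$IMG" bs=4096 count=64 status=none
cat "$IMG" > /dev/null                         # pull the file into the cache
fincore --noheadings --output PAGES "$IMG"     # resident pages: likely 64
dd if="$IMG" iflag=nocache count=0 status=none # ask the kernel to drop them
fincore --noheadings --output PAGES "$IMG"     # resident pages: likely 0
rm -f "$IMG"
```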
The problem is that the command has to be run over and over again, since reads can happen from any application at any time. So the "kludge" is setting up the above command to execute every second or so.
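As a sketch, the kludge amounts to the following (bounded to three iterations so it terminates, and using a throwaway file in place of the real image path, which is an assumption here):

```shell
# Bounded sketch of the polling kludge. A throwaway file stands in
# for root.squashfs so the example runs anywhere; in real use, point
# IMG at the actual image and replace the for-loop with `while true`.
IMG=$(mktemp)
dd if=/dev/zero of="$IMG" bs=4096 count=16 status=none
for i in 1 2 3; do
    # count=0 reads no data, but per the coreutils docs this form of
    # iflag=nocache still asks the kernel to drop the whole file's
    # cached pages (via POSIX_FADV_DONTNEED).
    dd if="$IMG" iflag=nocache count=0 status=none
    sleep 1
done
rm -f "$IMG"
```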
Clearly, this is inelegant and a complete hack, but at least it shows exactly what I'm after: it drops the page cache for the image file itself (not for the decompressed files). Imagine that command running every millisecond; that's exactly what I'd want, but without polling like this. Are there any better ways to do it?
linux cache squashfs
asked May 17 at 15:41
kktsuri