How to cause the kernel to compact fragmented memory
I am running Fedora 26.
This is for a very strange assignment given by my algorithms professor. The assignment says:
Memory fragmentation in C:
Design, implement, and execute a C program that does the following: It allocates memory for a sequence of 3m arrays of size 800,000 elements each; then it explicitly deallocates all even-numbered arrays and allocates a sequence of m arrays of size 900,000 elements each. Measure the amounts of time your program requires for the allocation of the first sequence and for the second sequence. Choose m to exhaust almost all of the main memory available to your program.
The overall goal of this is to fragment memory then request slightly more than what is available as a contiguous chunk, forcing the operating system to compact or defragment memory.
In class I asked how we should go about doing this, since memory is virtualized and not actually contiguous, to which he replied: "Well, you'll have to turn [virtual memory] off."
Some other students asked how we should know when we have hit this "garbage collection," and he said: "The timing for the second allocation should be greater than the first because of the time taken by garbage collection."
After searching around a bit, the closest thing I could find to disabling virtual memory was to disable swap with swapoff -a. I disabled my desktop environment and compiled and ran my program from a native terminal (to avoid possible interference from other processes, especially a heavy one like the desktop environment), then ran it with increasing m.
I eventually found a point where the time for the second allocation was greater than the time for the first. Along the way, however, I hit a point where the process was killed before the second allocation. I checked dmesg and saw that it had been killed by the oom-killer. I found and read several articles about the oom-killer and discovered that you can disable overcommitting of memory by the kernel.
I did this and ran my program again, only this time I was not able to find an m such that the timing of the second allocation was higher than the first. Eventually, with larger and larger m (though much smaller than when overcommit was enabled), malloc would fail and my program would terminate.
I have three questions, the first of which isn't really that important:
Is garbage collection the correct term for this? My professor is very adamant that this is garbage collection, but I was under the impression that garbage collection is something done by programming language runtimes, and that this would be better described as defragmentation.
Is compaction like he wants possible on a Linux system?
Why was I able to reach a point where the time for the second allocation was higher than the first when I disabled swap but still had overcommitting of memory enabled? Did compaction actually take place? If so, why was I not able to hit a point where compaction happened after I disabled overcommitting of memory?
memory swap defragmentation
You cannot "turn off" virtual memory. Also, be aware that in Linux you can logically allocate more memory than you actually have -- the kernel won't actually allocate the pages until you write to them.
– Andy Dalton
Feb 8 at 6:29
I was expecting yet another Write my homework! question. But rather this is a My homework seems badly specified, ill-thought, and impossible. Is it? question. Some of this is Stack Overflow territory, and you will find a lot of Q&As there along the lines of (just to pick one example at random) stackoverflow.com/questions/4039274.
– JdeBP
Feb 8 at 7:12
edited Feb 8 at 6:07
asked Feb 8 at 5:10
xylafur
435
1 Answer
Kudos on your research so far, this is indeed an interesting set of questions.
There's an important aspect to consider here generally: memory allocation is partly the operating system's responsibility, and partly each running process's responsibility (ignoring old systems without memory protection and virtual address spaces). The operating system takes care of providing each process with its own address space, and of allocating physical memory to processes when necessary. Each process takes care of carving up its address space (to some extent) and ensuring it's used appropriately. Note that a process's responsibility will be largely invisible to programmers, since the runtime environment takes care of most things.
Now, to answer your questions...
In my mind garbage collection is one step removed from what you're doing here. I imagine you're writing in C, using malloc() and free(). Garbage collection, where supported by the programming language and runtime environment, takes care of the latter part: it identifies blocks of memory which were previously allocated but are no longer in use (and, importantly, can never be used again), and returns them to the allocator. The question linked in JdeBP's comment provides some background, but I find it mostly interesting because it demonstrates that different people have very different opinions on garbage collection, and even on what constitutes garbage collection. In the context we're interested in, I would use "memory compaction" to talk about the process under discussion.
From a userspace programming perspective, what your professor is asking for isn't possible, in C, under Linux, for one simple reason: what we care about here isn't physical memory fragmentation, it's address space fragmentation. When you allocate your many 800,000-byte blocks, you'll end up with just as many pointers to each block. On Linux, at this point, the operating system itself hasn't done much, and you don't necessarily have physical memory backing each allocation (as an aside, with smaller allocations the operating system wouldn't be involved at all, just your C library's allocator; but the allocations here are large enough that the C library will use mmap, which is handled by the kernel). When you free the odd-numbered blocks, you get back those blocks of address space, but you can't change the pointers you have to the other blocks. If you print the pointers out as you go, you'll see the difference between them is not much more than the allocation request (802,816 bytes on my system); there's no room between two pointers for a 900,000-byte block. Because your program has actual pointers to each block, rather than some more abstract value (in other contexts, a handle), the runtime environment can't do anything about that, and so it can't compact its memory to coalesce free blocks.
If you use a programming language where pointers aren't a programmer-visible concept, then memory compaction is possible under Linux. Another possibility would be to use a memory allocation API whose returned values aren't pointers; see for example the handle-based heap allocation functions under Windows (where pointers are only valid while a handle is locked).
Your professor's exercise is effectively measuring the performance of mmap, which includes its free-block-walking algorithm. You first allocate 3 × m blocks, then free half of them, and then start allocating m blocks again; freeing all those blocks dumps a huge load of free blocks on the kernel's allocator, which it needs to keep track of (and the time taken by the free calls shows that there's no optimisation being done at this point). If you track the allocation times of each individual block, you'll see that the first 900k allocation takes much, much longer than the others (three orders of magnitude on my system), the second is much faster but still takes a lot longer (two orders of magnitude), and the third allocation is back to typical performance levels. So there's something going on, but the returned pointers show that it isn't memory compaction, at least not allocated-block compaction (which, as explained above, is impossible); presumably the time corresponds to processing the data structures the kernel uses to keep track of the available address space in the process (I'm checking this and will update later). These lengthy allocations can grow to dwarf the overall allocation sequences you're measuring, which is when the 900k allocations end up taking longer overall than the 800k allocations.
The reason overcommit changes the behaviour you see is that it changes the exercise from purely manipulating address space to actually allocating memory, and thus reduces the size of your playground. When you can overcommit, the kernel is only limited by your process's address space, so you can allocate far more blocks and put much more pressure on the allocator. When you disable overcommit, the kernel is limited by the available memory, which reduces the value you can use for m down to levels where the allocator isn't stressed enough for the allocation times to blow up.
Would using calloc() or actually writing to the allocated arrays make any difference here?
– Kusalananda
Feb 8 at 9:35
Writing to the allocated memory would cancel the overcommit capability, but it's hard to deal with because failure then results in the OOM-killer stepping in (and not necessarily killing the over-allocating process). calloc() with large allocations behaves the same as malloc() on Linux, using mmap() to allocate an anonymous mapping, which is zero-filled when first used (so overcommit still works).
– Stephen Kitt
Feb 8 at 9:48
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
5
down vote
accepted
Kudos on your research so far, this is indeed an interesting set of questions.
ThereâÂÂs an important aspect to consider here generally: memory allocation is partly the operating systemâÂÂs responsibility, and partly each running processâÂÂs responsibility (ignoring old systems without memory protection and virtual address spaces). The operating system takes care of providing each process with its own address space, and allocating physical memory to processes when necessary. Each process takes care of carving up its address space (to some extent) and ensuring itâÂÂs used appropriately. Note that a processâÂÂs responsibility will be largely invisible to programmers, since the runtime environment takes care of most things.
Now, to answer your questions...
In my mind garbage collection is one step removed from what youâÂÂre doing here. I imagine youâÂÂre writing in C, using
malloc()
andfree()
. Garbage collection, where supported by the programming language and runtime environment, takes care of the latter part: it identifies blocks of memory which were previously allocated but are no longer in use (and, importantly, can never again be used), and returns them to the allocator. The question linked in JdeBPâÂÂs comment provides some background, but I find it mostly interesting because it demonstrates that different people have very different opinions on garbage collection, and even what constitutes garbage collection.In the context weâÂÂre interested in, I would use âÂÂmemory compactionâ to talk about the process under discussion.
From a userspace programming perspective, what your professor is asking for isnâÂÂt possible, in C, under Linux, for one simple reason: what we care about here isnâÂÂt physical memory fragmentation, itâÂÂs address space fragmentation. When you allocate your many 800,000-byte blocks, youâÂÂll end up with just as many pointers to each block. On Linux, at this point, the operating system itself hasnâÂÂt done much, and you donâÂÂt necessarily have physical memory backing each allocation (as an aside, with smaller allocations the operating system wouldnâÂÂt be involved at all, just your C libraryâÂÂs allocator; but the allocations here are large enough that the C library will use
mmap
, which is handled by the kernel). When you free the odd-numbered blocks, you get back those blocks of address space, but you canâÂÂt change the pointers you have to the other blocks. If you print the pointers out as you go, youâÂÂll see the difference between them is not much more than the allocation request (802,816 bytes on my system); thereâÂÂs no room between two pointers for a 900,000 byte block. Because your program has actual pointers to each block, rather than some more abstract value (in other contexts, a handle), the runtime environment canâÂÂt do anything about that, and so it canâÂÂt compact its memory to coalesce free blocks.If you use a programming language where pointers arenâÂÂt a programmer-visible concept, then memory compaction is possible, under Linux. Another possibility would be to use a memory allocation API where the returned values arenâÂÂt pointers; see for example the handle-based heap allocation functions under Windows (where pointers are only valid when a handle is locked).
Your professorâÂÂs exercise is effectively measuring the performance of
mmap
, which includes its free-block-walking algorithm. You first allocate 3ààm blocks, then free half of them, and then start allocating m blocks again; freeing all those blocks dumps a huge load of free blocks on the kernelâÂÂs allocator, which it needs to keep track of (and the time taken by thefree
calls shows that thereâÂÂs no optimisation being done at this point). If you track the allocation times of each individual block, youâÂÂll then see that the first 900k allocation takes much, much longer than the others (three orders of magnitude on my system), the second is much faster but still takes a lot longer (two orders of magnitude), and the third allocation is back to typical performance levels. So thereâÂÂs something going on, but the returned pointers show that it wasnâÂÂt memory compaction, at least not allocated block compaction (which, as explained above, is impossible) â presumably the time corresponds to time processing the data structures the kernel uses to keep track of the available address space in the process (IâÂÂm checking this and will update later). These lengthy allocations can grow to dwarf the overall allocation sequences youâÂÂre measuring, which is when the 900k allocations end up taking longer overall than the 800k allocations.The reason overcommit changes the behaviour you see is that it changes the exercise from purely manipulating address space, to actually allocating memory, and thus reduces the size of your playground. When you can overcommit, the kernel is only limited by your processâÂÂs address space, so you can allocate far more blocks and put much more pressure on the allocator. When you disable overcommit, the kernel is limited by the available memory, which reduces the value you can have for
m
down to levels where the allocator isnâÂÂt stressed enough for the allocation times to blow up.
Would using calloc() or actually writing to the allocated arrays make any difference here?
â Kusalananda
Feb 8 at 9:35
Writing to the allocated memory would cancel the overcommit capability, but itâÂÂs hard to deal with because failure then results in the OOM-killer stepping in (and not necessarily killing the over-allocating process).calloc()
with large allocations behaves the same asmalloc()
on Linux, usingmmap()
to allocate an anonymous mapping, which is zero-filled when first used (so overcommit still works).
â Stephen Kitt
Feb 8 at 9:48
add a comment |Â
up vote
5
down vote
accepted
Kudos on your research so far, this is indeed an interesting set of questions.
ThereâÂÂs an important aspect to consider here generally: memory allocation is partly the operating systemâÂÂs responsibility, and partly each running processâÂÂs responsibility (ignoring old systems without memory protection and virtual address spaces). The operating system takes care of providing each process with its own address space, and allocating physical memory to processes when necessary. Each process takes care of carving up its address space (to some extent) and ensuring itâÂÂs used appropriately. Note that a processâÂÂs responsibility will be largely invisible to programmers, since the runtime environment takes care of most things.
Now, to answer your questions...
In my mind garbage collection is one step removed from what youâÂÂre doing here. I imagine youâÂÂre writing in C, using
malloc()
andfree()
. Garbage collection, where supported by the programming language and runtime environment, takes care of the latter part: it identifies blocks of memory which were previously allocated but are no longer in use (and, importantly, can never again be used), and returns them to the allocator. The question linked in JdeBPâÂÂs comment provides some background, but I find it mostly interesting because it demonstrates that different people have very different opinions on garbage collection, and even what constitutes garbage collection.In the context weâÂÂre interested in, I would use âÂÂmemory compactionâ to talk about the process under discussion.
From a userspace programming perspective, what your professor is asking for isnâÂÂt possible, in C, under Linux, for one simple reason: what we care about here isnâÂÂt physical memory fragmentation, itâÂÂs address space fragmentation. When you allocate your many 800,000-byte blocks, youâÂÂll end up with just as many pointers to each block. On Linux, at this point, the operating system itself hasnâÂÂt done much, and you donâÂÂt necessarily have physical memory backing each allocation (as an aside, with smaller allocations the operating system wouldnâÂÂt be involved at all, just your C libraryâÂÂs allocator; but the allocations here are large enough that the C library will use
mmap
up vote
5
down vote
accepted
Kudos on your research so far; this is indeed an interesting set of questions.
There's an important aspect to consider here generally: memory allocation is partly the operating system's responsibility, and partly each running process's responsibility (ignoring old systems without memory protection and virtual address spaces). The operating system takes care of providing each process with its own address space, and of allocating physical memory to processes when necessary. Each process takes care of carving up its address space (to some extent) and ensuring it's used appropriately. Note that a process's responsibility is largely invisible to programmers, since the runtime environment takes care of most things.
Now, to answer your questions...
In my mind, garbage collection is one step removed from what you're doing here. I imagine you're writing in C, using malloc() and free(). Garbage collection, where supported by the programming language and runtime environment, takes care of the latter part: it identifies blocks of memory which were previously allocated but are no longer in use (and, importantly, can never again be used), and returns them to the allocator. The question linked in JdeBP's comment provides some background, but I find it mostly interesting because it demonstrates that different people have very different opinions on garbage collection, and even on what constitutes garbage collection.
In the context we're interested in, I would use "memory compaction" to talk about the process under discussion.
From a userspace programming perspective, what your professor is asking for isn't possible, in C, under Linux, for one simple reason: what we care about here isn't physical memory fragmentation, it's address space fragmentation. When you allocate your many 800,000-byte blocks, you'll end up with just as many pointers to each block. On Linux, at this point, the operating system itself hasn't done much, and you don't necessarily have physical memory backing each allocation (as an aside, with smaller allocations the operating system wouldn't be involved at all, just your C library's allocator; but the allocations here are large enough that the C library will use mmap, which is handled by the kernel). When you free the odd-numbered blocks, you get back those blocks of address space, but you can't change the pointers you have to the other blocks. If you print the pointers out as you go, you'll see that the difference between them is not much more than the allocation request (802,816 bytes on my system); there's no room between two pointers for a 900,000-byte block. Because your program has actual pointers to each block, rather than some more abstract value (in other contexts, a handle), the runtime environment can't do anything about that, and so it can't compact its memory to coalesce free blocks.
If you use a programming language where pointers aren't a programmer-visible concept, then memory compaction is possible, under Linux. Another possibility would be to use a memory allocation API where the returned values aren't pointers; see for example the handle-based heap allocation functions under Windows (where pointers are only valid while a handle is locked).
Your professor's exercise is effectively measuring the performance of mmap, which includes its free-block-walking algorithm. You first allocate 3 × m blocks, then free half of them, and then start allocating m blocks again; freeing all those blocks dumps a huge load of free blocks on the kernel's allocator, which it needs to keep track of (and the time taken by the free calls shows that there's no optimisation being done at this point). If you track the allocation times of each individual block, you'll see that the first 900k allocation takes much, much longer than the others (three orders of magnitude on my system), the second is much faster but still takes a lot longer (two orders of magnitude), and the third allocation is back to typical performance levels. So there's something going on, but the returned pointers show that it wasn't memory compaction, at least not allocated-block compaction (which, as explained above, is impossible); presumably the time corresponds to processing the data structures the kernel uses to keep track of the available address space in the process (I'm checking this and will update later). These lengthy allocations can grow to dwarf the overall allocation sequences you're measuring, which is when the 900k allocations end up taking longer overall than the 800k allocations.
The reason overcommit changes the behaviour you see is that it changes the exercise from purely manipulating address space to actually allocating memory, and thus reduces the size of your playground. When you can overcommit, the kernel is only limited by your process's address space, so you can allocate far more blocks and put much more pressure on the allocator. When you disable overcommit, the kernel is limited by the available memory, which reduces the value you can use for m down to levels where the allocator isn't stressed enough for the allocation times to blow up.
answered Feb 8 at 9:18
Stephen Kitt
142k22308369
Would using calloc() or actually writing to the allocated arrays make any difference here?
– Kusalananda
Feb 8 at 9:35
Writing to the allocated memory would cancel the overcommit capability, but it's hard to deal with because failure then results in the OOM-killer stepping in (and not necessarily killing the over-allocating process). calloc() with large allocations behaves the same as malloc() on Linux, using mmap() to allocate an anonymous mapping, which is zero-filled when first used (so overcommit still works).
– Stephen Kitt
Feb 8 at 9:48
add a comment
You cannot "turn off" virtual memory. Also, be aware that in Linux you can logically allocate more memory than you actually have -- the kernel won't actually allocate the pages until you write to them.
– Andy Dalton
Feb 8 at 6:29
3
I was expecting yet another Write my homework! question. But rather this is a My homework seems badly specified, ill-thought, and impossible. Is it? question. Some of this is Stack Overflow territory, and you will find a lot of Q&As there along the lines of (just to pick one example at random) stackoverflow.com/questions/4039274 .
– JdeBP
Feb 8 at 7:12