How does stack allocation work in Linux?

Does the OS reserve a fixed amount of valid virtual address space for the stack, or does it do something else? Can I produce a stack overflow just by using big local variables?



I wrote a small C program to test my assumption. It runs on x86-64 CentOS 6.5.



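#include <string.h>
#include <stdio.h>

int main(void)
{
    int n = 10240 * 1024;
    char a[n];              /* variable-length array placed on the stack */
    memset(a, 'x', n);      /* touch every byte so all pages are really used */
    /* %x with a pointer is not portable; on this system it printed only the
       low 32 bits of each address (see the 8-digit output below) */
    printf("%x\n%x\n", &a[0], &a[n-1]);
    getchar();              /* pause so that /proc/<pid>/maps can be inspected */
    return 0;
}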


Running the program gives &a[0] = f0ceabe0 and &a[n-1] = f16eabdf



/proc/<pid>/maps shows the stack at 7ffff0cea000-7ffff16ec000 (10248 * 1024 B).



Then I tried to increase n = 11240 * 1024



Running the program gives &a[0] = b6b36690 and &a[n-1] = b763068f



/proc/<pid>/maps shows the stack at 7fffb6b35000-7fffb7633000 (11256 * 1024 B).



ulimit -s prints 10240 on my PC.



As you can see, in both cases the stack is bigger than the limit that ulimit -s reports, and it grows with a bigger local variable. The top of the stack (the lowest mapped address) is somehow 3-5 kB below &a[0] (AFAIK the red zone is only 128 B).



So how does this stack map get allocated?










      linux memory virtual-memory stack






      edited Jul 20 '14 at 23:52 by enedil
      asked Jul 20 '14 at 9:48 by Amos




















          3 Answers

          Accepted answer (score 14)

          It appears that the memory up to the stack limit is not allocated in advance (it could not be anyway with an unlimited stack). https://www.kernel.org/doc/Documentation/vm/overcommit-accounting says:




          The C language stack growth does an implicit mremap. If you want absolute
          guarantees and run close to the edge you MUST mmap your stack for the
          largest size you think you will need. For typical stack usage this does
          not matter much but it's a corner case if you really really care




          However, mmapping the stack would be the job of the compiler (if it has an option for that).



          EDIT: After some tests on an x86_64 Debian machine, I've found that the stack grows without any system call (according to strace). So the kernel grows it automatically (this is what "implicit" means above), i.e. without an explicit mmap/mremap from the process.



          It was quite hard to find detailed information confirming this. I recommend Understanding The Linux Virtual Memory Manager by Mel Gorman. I suppose that the answer is in Section 4.6.1 Handling a Page Fault, with the exception "Region not valid but is beside an expandable region like the stack" and the corresponding action "Expand the region and allocate a page". See also D.5.2 Expanding the Stack.



          Other references about Linux memory management (but with almost nothing about the stack):



          • Memory FAQ


          • What every programmer should know about memory by Ulrich Drepper

          EDIT 2: This implementation has a drawback: in corner cases, a stack-heap collision may go undetected, even when the stack would grow beyond the limit! The reason is that a write to a variable on the stack may land in allocated heap memory, in which case there is no page fault and the kernel cannot know that the stack needed to be extended. See my example in the discussion "Silent stack-heap collision under GNU/Linux" that I started on the gcc-help list. To avoid that, the compiler needs to add some code at function calls; this can be done with -fstack-check for GCC (see Ian Lance Taylor's reply and the GCC man page for details).




























          • That seems the correct answer to my question. But it confuses me more. When will the mremap call get triggered? Will it be a syscall built into the program?
            – Amos
            Jul 20 '14 at 11:19










          • @amos I assume that the mremap call will be triggered if need be at a function call or when alloca() is called.
            – vinc17
            Jul 20 '14 at 11:22










          • It would probably be a good idea to mention what mmap is, for people who don't know.
            – Faheem Mitha
            Jul 20 '14 at 13:35










          • @FaheemMitha I've added some information. For those who don't know what mmap is, see the memory FAQ mentioned above. Here, for the stack, it would have been "anonymous mapping" so that unused space wouldn't take any physical memory, but as explained by Mel Gorman, the kernel does the mapping (virtual memory) and the physical allocation at the same time.
            – vinc17
            Jul 20 '14 at 14:42






          • @max I've tried the OP's program with ulimit -s giving 10240, like under the OP's conditions, and I get a SIGSEGV as expected (this is what is required by POSIX: "If this limit is exceeded, SIGSEGV shall be generated for the thread."). I suspect a bug in the OP's kernel.
            – vinc17
            Feb 21 '17 at 16:35

















          Answer (score 5)













          Linux kernel 4.2




          • mm/mmap.c#acct_stack_growth decides whether it will segfault or not. It uses rlim[RLIMIT_STACK], which corresponds to the POSIX getrlimit(RLIMIT_STACK)


          • arch/x86/mm/fault.c#do_page_fault is the interrupt handler that starts a chain which ends up calling acct_stack_growth


          • arch/x86/entry/entry_64.S sets up the page fault handler. You need to know a bit about paging to understand that part: How does x86 paging work? | Stack Overflow

          Minimal test program



          We can then test it with a minimal NASM 64-bit program:



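          global _start
          _start:
              sub rsp, 0x7FF000   ; move the stack pointer down by 8 MiB minus one 4 KiB page
              mov [rsp], rax      ; touch the new stack top; this faults if the stack cannot grow
              mov rax, 60         ; system call number of exit
              mov rdi, 0          ; exit status 0
              syscall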


          Make sure that you turn off ASLR and remove environment variables as those will go on the stack and take up space:



          echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
          env -i ./main.out


          The limit is somewhere slightly below my ulimit -s (8MiB for me). Looks like this is because of extra System V specified data initially put on the stack in addition to the environment: Linux 64 command line parameters in Assembly | Stack Overflow



          If you are serious about this, TODO make a minimal initrd image that starts writing from the stack top and goes down, and then run it with QEMU + GDB. Put a dprintf on the loop printing the stack address, and a breakpoint at acct_stack_growth. It will be glorious.



          Related:



          • https://softwareengineering.stackexchange.com/questions/207386/how-are-the-size-of-the-stack-and-heap-limited-by-the-os

          • Where is the stack memory allocated from for a Linux process? | Stack Overflow

          • What is the Linux Stack? | Stack Overflow


































            Answer (score 2)













            By default, the maximum stack size is configured to be 8 MB per process, but it can be changed using ulimit:



            Showing the default in kB:



            $ ulimit -s
            8192


            Set to unlimited:



            ulimit -s unlimited



            This affects the current shell, its subshells, and their child processes (ulimit is a shell builtin command).



            On Linux, you can show the stack address range actually in use with:
            grep -F '[stack]' /proc/$PID/maps




























            • So when a program is loaded by the current shell, the OS makes a memory segment of `ulimit -s` KB valid for the program. In my case that is 10240 KB. But when I declare a local array char a[10240*1024] and set a[0]=1, the program exits correctly. Why?
              – Amos
              Jul 20 '14 at 10:27











            • Try to set the last element too. And make sure that they are not optimized away.
              – vinc17
              Jul 20 '14 at 10:34










            • @amos I think what vinc17 means is that you named a memory region that would not fit on the stack in your program, but as you do not actually access it in the part that does not fit, the machine never notices that - it does not even get that information.
              – Volker Siegel
              Jul 20 '14 at 10:56











            • @amos Try int n = 10240*1024; char a[n]; memset(a,'x',n); ...seg fault.
              – goldilocks
              Jul 20 '14 at 11:05







            • @amos So, as you can see, a has not been allocated in your 10MB stack. The compiler might have seen that there couldn't be a recursive call and has done special allocation, or something else like a discontinuous stack or some indirection.
              – vinc17
              Jul 20 '14 at 12:20










            Accepted answer: answered Jul 20 '14 at 10:51 by vinc17, edited Jul 20 '14 at 23:59 by enedil























            up vote
            5
            down vote













            Linux kernel 4.2




            • mm/mmap.c#acct_stack_growth decides whether the access will segfault. It checks rlim[RLIMIT_STACK], which corresponds to the POSIX getrlimit(RLIMIT_STACK)


            • arch/x86/mm/fault.c#do_page_fault is the page fault exception handler that starts the call chain which ends up calling acct_stack_growth


            • arch/x86/entry/entry_64.S sets up the page fault handler. You need to know a bit about paging to understand that part: How does x86 paging work? | Stack Overflow

            Minimal test program



            We can then test this with a minimal 64-bit NASM program:



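            global _start
            _start:
            ; Move the stack pointer down by 8 MiB minus one 4 KiB page
            sub rsp, 0x7FF000
            ; Touch the new stack top: this page faults, and the kernel
            ; either grows the stack or delivers SIGSEGV
            mov [rsp], rax
            ; exit(0)
            mov rax, 60
            mov rdi, 0
            syscall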


            Make sure that you turn off ASLR and remove environment variables as those will go on the stack and take up space:



            echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
            env -i ./main.out


            The limit is slightly below my ulimit -s (8 MiB for me). This appears to be because of the extra data that the System V ABI specifies is initially placed on the stack, in addition to the environment: Linux 64 command line parameters in Assembly | Stack Overflow



            If you are serious about this, TODO make a minimal initrd image that starts writing from the stack top and goes down, and then run it with QEMU + GDB. Put a dprintf on the loop printing the stack address, and a breakpoint at acct_stack_growth. It will be glorious.



            Related:



            • https://softwareengineering.stackexchange.com/questions/207386/how-are-the-size-of-the-stack-and-heap-limited-by-the-os

            • Where is the stack memory allocated from for a Linux process? | Stack Overflow

            • What is the Linux Stack? | Stack Overflow


































                edited Sep 2 at 21:02

























                answered Oct 28 '15 at 21:00









                Ciro Santilli 新疆改造中心 六四事件 法轮功





















                    up vote
                    2
                    down vote













                    By default, the maximum stack size is usually configured to be 8 MB per process, but it can be changed using ulimit:



                    Showing the default in kB:



                    $ ulimit -s
                    8192


                    Set to unlimited:



                    ulimit -s unlimited



                    This affects the current shell, its subshells, and their child processes (ulimit is a shell builtin command).



                    On Linux, you can show the actual stack address range in use with:
                    grep -F '[stack]' /proc/$PID/maps




























                    • So when a program is loaded by the current shell, the OS will make a memory segment of ulimit -s KB valid for the program. In my case it's 10240 KB. But when I declare a local array char a[10240*1024] and set a[0]=1, the program exits correctly. Why?
                      – Amos
                      Jul 20 '14 at 10:27











                    • Try to set the last element too. And make sure that they are not optimized away.
                      – vinc17
                      Jul 20 '14 at 10:34










                    • @amos I think what vinc17 means is that you named a memory region that would not fit on the stack in your program, but as you do not actually access it in the part that does not fit, the machine never notices that - it does not even get that information.
                      – Volker Siegel
                      Jul 20 '14 at 10:56











                    • @amos Try int n = 10240*1024; char a[n]; memset(a,'x',n); ...seg fault.
                      – goldilocks
                      Jul 20 '14 at 11:05







                    • 2




                      @amos So, as you can see, a has not been allocated in your 10MB stack. The compiler might have seen that there couldn't be a recursive call and has done special allocation, or something else like a discontinuous stack or some indirection.
                      – vinc17
                      Jul 20 '14 at 12:20






















                    edited Jul 20 '14 at 13:39









                    Faheem Mitha











                    answered Jul 20 '14 at 10:06









                    Volker Siegel
