How are stacks part of the problem? Given tail-call optimization stacks shouldn'...

JoshTriplett · on Oct 29, 2014

They shouldn't, but they do tend to have a large quantity of memory backing them that gets added to when grown into. The kernel uses 4k, 8k, or 16k stacks; a quick test on a Linux x86-64 system suggests that userspace has 8M stacks by default.

Hackish test code to recurse infinitely and print the stack pointer until a segfault:

    #include <stdio.h>

    static inline unsigned long get_sp(void)
    {
        unsigned long ret;
        __asm__("mov %%rsp, %0" : "=r" (ret));
        return ret;
    }

    void main(void)
    {
        printf("%lu\n", get_sp());
        main();
    }

Running this and comparing the first and last values shows a stack size of about 8M.

EdiX · on Nov 5, 2014

I didn't see this until now, the problem is that when you spawn a new thread space for the stack needs to be allocated. Since the stack can't be reallocated easily in C/C++ it also needs to be "big enough". The current implementation is to allocate several megabytes of stack for each thread, since memory accounting isn't strict this doesn't create problems because unused stack doesn't really take up space. If you start accounting memory you will also have to be stricter with allocating stacks for new threads, limiting yourself to just a page or two wherever possible. This is not an easy task.