What happened in my stack?

If you write a program that needs a big chunk of memory at one time, you will notice the influence of the ‘stack size’ limit.

I will show what this stack limit does with the following simple program.

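#include <stdio.h>
#include <stdlib.h>	/* malloc(), free() */
#include <unistd.h>	/* getpid() */

int main() {
	char *a;
	unsigned long i, half_gb = 512 * 1024 * 1024;
	unsigned long max = (unsigned long)3 * 1024 * 1024 * 1024;

	printf("PID = %d\n\n", (int)getpid());
	/* Try 512MB, 1024MB, ... up to (but not including) 3GB. */
	for (i = half_gb; i < max; i += half_gb) {
		a = malloc(i);
		printf("a = %p, size = %luMB\n", (void *)a, i / (1024 * 1024));
		if (a != NULL) {
			/* Wait for Enter so the mapping can be inspected with pmap. */
			fgets(a, i, stdin);
			free(a);
		}
	}
	return 0;
}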

On Linux, you can change the stack size limit with the ‘ulimit -s’ command. On this machine the default is 10240 kbytes (10MB).

[root@localhost ~]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 32768
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 32768
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

The ulimit output shows that the stack size is ‘10240kbytes’. We can change it with the ‘-s’ option. First, I will run the above application without changing anything.

[root@localhost ~]# ./test
PID = 7865

a = 0x97f9e008, size = 512MB

a = 0x77f9e008, size = 1024MB

a = 0x57f9e008, size = 1536MB

a = 0x37f9e008, size = 2048MB
a = 0x17f9e008, size = 2560MB

And the memory layout will be something like this:

[root@localhost ~]# pmap -x 7865
7865:   ./test
Address   Kbytes     RSS    Anon  Locked Mode   Mapping
004c9000     104       -       -       - r-x--  ld-2.5.so
004e3000       4       -       -       - r-x--  ld-2.5.so
004e4000       4       -       -       - rwx--  ld-2.5.so
004e7000    1268       -       -       - r-x--  libc-2.5.so
00624000       8       -       -       - r-x--  libc-2.5.so
00626000       4       -       -       - rwx--  libc-2.5.so
00627000      12       -       -       - rwx--    [ anon ]
00f43000       4       -       -       - r-x--    [ anon ]
08048000       4       -       -       - r-x--  test
08049000       4       -       -       - rw---  test
57f9e000 1572876       -       -       - rw---    [ anon ]
b7fae000       8       -       -       - rw---    [ anon ]
bfaac000      84       -       -       - rw---    [ stack ]
-------- ------- ------- ------- -------
total kB 1574384       -       -       -

You can see that the malloc() memory starts at 57f9e000. Nothing in this layout looks special, but you will see a difference soon.

If you change the stack size with something like this:

[root@localhost ~]# ulimit -s unlimited
[root@localhost ~]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 32768
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 32768
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

And run the same application again.

[root@localhost ~]# ./test
PID = 7791

a = 0x40012008, size = 512MB

a = 0x40012008, size = 1024MB

a = 0x40012008, size = 1536MB

a = (nil), size = 2048MB
a = (nil), size = 2560MB

It failed when I requested 2GB of memory, which succeeded in the previous test. When I looked at the memory layout, it was different.

[root@localhost ~]# pmap -x 7791
7791:   ./test
Address   Kbytes     RSS    Anon  Locked Mode   Mapping
004c9000     104       -       -       - r-x--  ld-2.5.so
004e3000       4       -       -       - r-x--  ld-2.5.so
004e4000       4       -       -       - rwx--  ld-2.5.so
004e7000    1268       -       -       - r-x--  libc-2.5.so
00624000       8       -       -       - r-x--  libc-2.5.so
00626000       4       -       -       - rwx--  libc-2.5.so
00627000      12       -       -       - rwx--    [ anon ]
08048000       4       -       -       - r-x--  test
08049000       4       -       -       - rwx--  test
40000000       4       -       -       - r-x--    [ anon ]
40001000       8       -       -       - rw---    [ anon ]
40010000 1572876       -       -       - rw---    [ anon ]
bf87a000      88       -       -       - rw---    [ stack ]
-------- ------- ------- ------- -------
total kB 1574388       -       -       -

The main difference is that with the ‘unlimited’ option, other memory chunks now sit right below the one allocated by malloc. The following listing shows who uses those chunks: the main reason is ‘[vdso]’. Because it is located at a fixed address, every other allocation has to start after that address. This reduces the largest block of memory that can be allocated.

40000000-40001000 r-xp 40000000 00:00 0          [vdso]
40001000-40003000 rw-p 40001000 00:00 0
40010000-80013000 rw-p 40010000 00:00 0
bffde000-bfff3000 rw-p bffde000 00:00 0          [stack]

The main problem is that ‘[vdso]’ is inserted into the middle of the heap when the stack size is set to ‘unlimited’. [vdso] is the memory region used for the fast system call mechanism. With this region, a process does not need to raise an interrupt to make a system call; it just uses that memory range, which is much faster than an interrupt call.

I tried to find out why this location changes with the stack size, and I found the answer in the arch_pick_mmap_layout(..) function.

/*
 * Top of mmap area (just below the process stack).
 *
 * Leave an at least ~128 MB hole.
 */
#define MIN_GAP (128*1024*1024)
#define MAX_GAP (TASK_SIZE/6*5)

/*
 * True on X86_32 or when emulating IA32 on X86_64
 */
static int mmap_is_ia32(void)
{
#ifdef CONFIG_X86_32
        return 1;
#endif
#ifdef CONFIG_IA32_EMULATION
        if (test_thread_flag(TIF_IA32))
                return 1;
#endif
        return 0;
}

static int mmap_is_legacy(void)
{
        if (current->personality & ADDR_COMPAT_LAYOUT)
                return 1;

        if (current->signal->rlim[RLIMIT_STACK].rlim_cur == RLIM_INFINITY)
                return 1;

        return sysctl_legacy_va_layout;
}

static unsigned long mmap_rnd(void)
{
        unsigned long rnd = 0;

        /*
        *  8 bits of randomness in 32bit mmaps, 20 address space bits
        * 28 bits of randomness in 64bit mmaps, 40 address space bits
        */
        if (current->flags & PF_RANDOMIZE) {
                if (mmap_is_ia32())
                        rnd = (long)get_random_int() % (1<<8);
                else
                        rnd = (long)(get_random_int() % (1<<28));
        }
        return rnd << PAGE_SHIFT;
}

static unsigned long mmap_base(void)
{
        unsigned long gap = current->signal->rlim[RLIMIT_STACK].rlim_cur;

        if (gap < MIN_GAP)
                gap = MIN_GAP;
        else if (gap > MAX_GAP)
                gap = MAX_GAP;

        return PAGE_ALIGN(TASK_SIZE - gap - mmap_rnd());
}

/*
 * Bottom-up (legacy) layout on X86_32 did not support randomization, X86_64
 * does, but not when emulating X86_32
 */
static unsigned long mmap_legacy_base(void)
{
        if (mmap_is_ia32())
                return TASK_UNMAPPED_BASE;
        else
                return TASK_UNMAPPED_BASE + mmap_rnd();
}

/*
 * This function, called very early during the creation of a new
 * process VM image, sets up which VM layout function to use:
 */
void arch_pick_mmap_layout(struct mm_struct *mm)
{
        if (mmap_is_legacy()) {
                mm->mmap_base = mmap_legacy_base();
                mm->get_unmapped_area = arch_get_unmapped_area;
                mm->unmap_area = arch_unmap_area;
        } else {
                mm->mmap_base = mmap_base();
                mm->get_unmapped_area = arch_get_unmapped_area_topdown;
                mm->unmap_area = arch_unmap_area_topdown;
        }
}

mmap_is_legacy() returns TRUE if the stack size is set to unlimited (or for some other reasons that I will not explain here). If mmap_is_legacy() returns TRUE, mm->mmap_base is set to TASK_UNMAPPED_BASE (1GB) on an i386 box. This is the start address of the heap, and the vdso is located right there. So, the problem(?) happens.
