If you write a program that needs a large chunk of memory at once, you will notice the influence of the ‘stack size’ limit.
I will show what happens under this stack limit with the following simple program.
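    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        char *a;
        unsigned long i, half_gb = 512UL * 1024 * 1024;
        unsigned long max = 3UL * 1024 * 1024 * 1024;

        printf("PID = %d\n\n", (int)getpid());
        for (i = half_gb; i < max; i += half_gb) {
            a = malloc(i);
            printf("a = %p, size = %luMB\n", (void *)a, i / (1024 * 1024));
            if (a != NULL) {
                /* Wait for a line on stdin so the mapping can be inspected
                 * with pmap. The read size is capped because fgets() takes
                 * an int, which sizes of 2GB and above would overflow. */
                fgets(a, 4096, stdin);
                free(a);
            }
        }
        return 0;
    }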
On Linux, you can change the stack size with the ‘ulimit -s’ command. On my system the default is 10240 KB (10MB).
    [root@localhost ~]# ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 32768
    max locked memory       (kbytes, -l) 32
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 10240
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 32768
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited
As the ulimit output shows, the stack size is 10240 KB, and it can be changed with the ‘-s’ option. First, I will run the test program without changing anything.
    [root@localhost ~]# ./test
    PID = 7865

    a = 0x97f9e008, size = 512MB
    a = 0x77f9e008, size = 1024MB
    a = 0x57f9e008, size = 1536MB
    a = 0x37f9e008, size = 2048MB
    a = 0x17f9e008, size = 2560MB
And the memory layout will be something like this:
    [root@localhost ~]# pmap -x 7865
    7865:   ./test
    Address   Kbytes     RSS    Anon  Locked Mode   Mapping
    004c9000     104       -       -       - r-x--  ld-2.5.so
    004e3000       4       -       -       - r-x--  ld-2.5.so
    004e4000       4       -       -       - rwx--  ld-2.5.so
    004e7000    1268       -       -       - r-x--  libc-2.5.so
    00624000       8       -       -       - r-x--  libc-2.5.so
    00626000       4       -       -       - rwx--  libc-2.5.so
    00627000      12       -       -       - rwx--    [ anon ]
    00f43000       4       -       -       - r-x--    [ anon ]
    08048000       4       -       -       - r-x--  test
    08049000       4       -       -       - rw---  test
    57f9e000 1572876       -       -       - rw---    [ anon ]
    b7fae000       8       -       -       - rw---    [ anon ]
    bfaac000      84       -       -       - rw---    [ stack ]
    -------- ------- ------- ------- -------
    total kB 1574384       -       -       -
You can see that the malloc() memory starts at 57f9e000 (this snapshot was taken during the 1536MB allocation). The layout looks normal, with nothing special in it. But you will see a difference soon.
Now change the stack size to unlimited:
    [root@localhost ~]# ulimit -s unlimited
    [root@localhost ~]# ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 32768
    max locked memory       (kbytes, -l) 32
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) unlimited
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 32768
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited
Then run the same application again:
    [root@localhost ~]# ./test
    PID = 7791

    a = 0x40012008, size = 512MB
    a = 0x40012008, size = 1024MB
    a = 0x40012008, size = 1536MB
    a = (nil), size = 2048MB
    a = (nil), size = 2560MB
It failed when I requested 2GB of memory, a request that succeeded in the previous test. When I looked at the memory layout, it was different.
    [root@localhost ~]# pmap -x 7791
    7791:   ./test
    Address   Kbytes     RSS    Anon  Locked Mode   Mapping
    004c9000     104       -       -       - r-x--  ld-2.5.so
    004e3000       4       -       -       - r-x--  ld-2.5.so
    004e4000       4       -       -       - rwx--  ld-2.5.so
    004e7000    1268       -       -       - r-x--  libc-2.5.so
    00624000       8       -       -       - r-x--  libc-2.5.so
    00626000       4       -       -       - rwx--  libc-2.5.so
    00627000      12       -       -       - rwx--    [ anon ]
    08048000       4       -       -       - r-x--  test
    08049000       4       -       -       - rwx--  test
    40000000       4       -       -       - r-x--    [ anon ]
    40001000       8       -       -       - rw---    [ anon ]
    40010000 1572876       -       -       - rw---    [ anon ]
    bf87a000      88       -       -       - rw---    [ stack ]
    -------- ------- ------- ------- -------
    total kB 1574388       -       -       -
The main difference is that with the ‘unlimited’ setting there are additional memory chunks at 0x40000000, right below the memory allocated by malloc(). The following listing shows who owns those chunks: the main reason is ‘[vdso]’. Because it is located at a fixed address, every memory allocation has to start after it. This reduces the size of the largest allocation that can succeed.
    40000000-40001000 r-xp 40000000 00:00 0          [vdso]
    40001000-40003000 rw-p 40001000 00:00 0
    40010000-80013000 rw-p 40010000 00:00 0
    bffde000-bfff3000 rw-p bffde000 00:00 0          [stack]
The main problem is that ‘[vdso]’ is inserted into the middle of the area used for the heap when the stack size is set to ‘unlimited’. [vdso] is a memory region that the kernel maps into the process to provide a fast system call mechanism. With it, the process does not need to raise a software interrupt for a system call; it calls the code in that memory range instead, which is much faster than an interrupt call.
I tried to find out why this location changes with the stack size, and I found the reason in the arch_pick_mmap_layout(..) function.
    /*
     * Top of mmap area (just below the process stack).
     *
     * Leave an at least ~128 MB hole.
     */
    #define MIN_GAP (128*1024*1024)
    #define MAX_GAP (TASK_SIZE/6*5)

    /*
     * True on X86_32 or when emulating IA32 on X86_64
     */
    static int mmap_is_ia32(void)
    {
    #ifdef CONFIG_X86_32
        return 1;
    #endif
    #ifdef CONFIG_IA32_EMULATION
        if (test_thread_flag(TIF_IA32))
            return 1;
    #endif
        return 0;
    }

    static int mmap_is_legacy(void)
    {
        if (current->personality & ADDR_COMPAT_LAYOUT)
            return 1;

        if (current->signal->rlim[RLIMIT_STACK].rlim_cur == RLIM_INFINITY)
            return 1;

        return sysctl_legacy_va_layout;
    }

    static unsigned long mmap_rnd(void)
    {
        unsigned long rnd = 0;

        /*
         *  8 bits of randomness in 32bit mmaps, 20 address space bits
         * 28 bits of randomness in 64bit mmaps, 40 address space bits
         */
        if (current->flags & PF_RANDOMIZE) {
            if (mmap_is_ia32())
                rnd = (long)get_random_int() % (1<<8);
            else
                rnd = (long)(get_random_int() % (1<<28));
        }
        return rnd << PAGE_SHIFT;
    }

    static unsigned long mmap_base(void)
    {
        unsigned long gap = current->signal->rlim[RLIMIT_STACK].rlim_cur;

        if (gap < MIN_GAP)
            gap = MIN_GAP;
        else if (gap > MAX_GAP)
            gap = MAX_GAP;

        return PAGE_ALIGN(TASK_SIZE - gap - mmap_rnd());
    }

    /*
     * Bottom-up (legacy) layout on X86_32 did not support randomization, X86_64
     * does, but not when emulating X86_32
     */
    static unsigned long mmap_legacy_base(void)
    {
        if (mmap_is_ia32())
            return TASK_UNMAPPED_BASE;
        else
            return TASK_UNMAPPED_BASE + mmap_rnd();
    }

    /*
     * This function, called very early during the creation of a new
     * process VM image, sets up which VM layout function to use:
     */
    void arch_pick_mmap_layout(struct mm_struct *mm)
    {
        if (mmap_is_legacy()) {
            mm->mmap_base = mmap_legacy_base();
            mm->get_unmapped_area = arch_get_unmapped_area;
            mm->unmap_area = arch_unmap_area;
        } else {
            mm->mmap_base = mmap_base();
            mm->get_unmapped_area = arch_get_unmapped_area_topdown;
            mm->unmap_area = arch_unmap_area_topdown;
        }
    }
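mmap_is_legacy() returns true if the stack size is set to unlimited (or for other reasons, such as the ADDR_COMPAT_LAYOUT personality or sysctl_legacy_va_layout, which I will not explain here). If mmap_is_legacy() returns true, mm->mmap_base is set to TASK_UNMAPPED_BASE (1GB, that is 0x40000000) on an i386 box. This is the address at which the large malloc() allocations, which glibc serves through mmap, start, and the vdso is located right there. So the problem(?) happens.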