Sungju's Slow Life

Personal journal


What is the virtual address limit of the 32-bit/64-bit Linux kernel?

  • RHEL 5 code
  • 32bit: include/asm-i386/processor.h
/*
 * User space process size: 3GB (default).
 */
#define TASK_SIZE (PAGE_OFFSET)
  • 64bit: include/asm-x86_64/processor.h
/*
 * User space process size. 47bits minus one guard page.
 */
#define TASK_SIZE64 (0x800000000000UL - 4096)

/* This decides where the kernel will search for a free chunk of vm
 * space during mmap's.
 */
#define IA32_PAGE_OFFSET ((current->personality & ADDR_LIMIT_3GB) ? 0xc0000000 : 0xFFFFe000)

#define TASK_SIZE     (test_thread_flag(TIF_IA32) ? IA32_PAGE_OFFSET : TASK_SIZE64)
#define TASK_SIZE_OF(child)   ((test_tsk_thread_flag(child, TIF_IA32)) ? IA32_PAGE_OFFSET : TASK_SIZE64)
  • RHEL 6 code
#ifdef CONFIG_X86_32
/*
 * User space process size: 3GB (default).
 */
#define TASK_SIZE   PAGE_OFFSET
#define TASK_SIZE_MAX   TASK_SIZE

#else
/*
 * User space process size. 47bits minus one guard page.
 */
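#define TASK_SIZE_MAX ((1UL << 47) - PAGE_SIZE)

/* This decides where the kernel will search for a free chunk of vm
 * space during mmap's.
 */
#define IA32_PAGE_OFFSET ((current->personality & ADDR_LIMIT_3GB) ? \
          0xc0000000 : 0xFFFFe000)

#define TASK_SIZE   (test_thread_flag(TIF_IA32) ? \
          IA32_PAGE_OFFSET : TASK_SIZE_MAX)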
  • RHEL 7
#ifdef CONFIG_X86_32
/*
 * User space process size: 3GB (default).
 */
#define TASK_SIZE   PAGE_OFFSET
#define TASK_SIZE_MAX   TASK_SIZE

#else
/*
 * User space process size. 47bits minus one guard page.
 */
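#define TASK_SIZE_MAX ((1UL << 47) - PAGE_SIZE)

/* This decides where the kernel will search for a free chunk of vm
 * space during mmap's.
 */
#define IA32_PAGE_OFFSET ((current->personality & ADDR_LIMIT_3GB) ? \
          0xc0000000 : 0xFFFFe000)

#define TASK_SIZE   (test_thread_flag(TIF_ADDR32) ? \
          IA32_PAGE_OFFSET : TASK_SIZE_MAX)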
  • The user space limit on a 32-bit system is 3 GiB, which is the same as PAGE_OFFSET (0xC0000000).
  • The user space limit on a 64-bit system can be calculated as shown here. It is 128 TiB minus 4 KiB.
(1UL << 47) == 140737488355328 == 128TiB
(0x800000000000UL - 4096) == 140737488351232 == 0x7ffffffff000
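
The arithmetic above can be checked with a small userspace C sketch. The helper names (`task_size_max_64`, `ia32_page_offset`) are made up for illustration and only mirror the kernel macros; a page size of 4096 bytes is assumed, and `uint64_t` is used so the result does not depend on the width of `unsigned long`.

```c
#include <stdint.h>

/* Assumed page size (4 KiB), matching the "- 4096" in TASK_SIZE64. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Hypothetical helper mirroring TASK_SIZE_MAX: 47 bits minus one guard page. */
uint64_t task_size_max_64(void)
{
    return (1ULL << 47) - SKETCH_PAGE_SIZE;
}

/* Hypothetical helper mirroring IA32_PAGE_OFFSET for a 32-bit task running
 * on a 64-bit kernel: 3 GiB when ADDR_LIMIT_3GB is set in the personality,
 * otherwise 4 GiB minus two pages (0xFFFFe000). */
uint64_t ia32_page_offset(int addr_limit_3gb)
{
    return addr_limit_3gb ? 0xc0000000ULL : 0xFFFFe000ULL;
}
```

Calling `task_size_max_64()` returns 0x7ffffffff000, the same value as the second line of the calculation above.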
  • The kernel space limit is the same as the user space limit, as described in ‘Documentation/x86/x86_64/mm.txt’.
Virtual memory map with 4 level page tables:

0000000000000000 - 00007fffffffffff (=47 bits) user space, different per mm
hole caused by [48:63] sign extension
ffff800000000000 - ffff80ffffffffff (=40 bits) guard hole
ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory
ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole
ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
... unused hole ...
ffffffff80000000 - ffffffffa0000000 (=512 MB)  kernel text mapping, from phys 0
ffffffffa0000000 - fffffffffff00000 (=1536 MB) module mapping space

The direct mapping covers all memory in the system up to the highest
memory address (this means in some cases it can also include PCI memory
holes).

vmalloc space is lazily synchronized into the different PML4 pages of
the processes using the page fault handler, with init_level4_pgt as
reference.

Current X86-64 implementations only support 40 bits of address space,
but we support up to 46 bits. This expands into MBZ space in the page tables.
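
The "hole caused by [48:63] sign extension" in the map above follows from the rule that bits 48 to 63 of a virtual address must be copies of bit 47. Below is a minimal sketch of that check, assuming 48-bit virtual addresses (4-level paging). `is_canonical` and `is_user_half` are illustrative names, not kernel functions.

```c
#include <stdint.h>

/* Assumed virtual address width for 4-level paging. */
#define SKETCH_VADDR_BITS 48

/* Returns 1 if bits 63..47 are all zeros or all ones, i.e. the address
 * equals the sign extension of its low 48 bits. */
int is_canonical(uint64_t addr)
{
    uint64_t top = addr >> (SKETCH_VADDR_BITS - 1);  /* bits 63..47 (17 bits) */
    return top == 0 || top == 0x1ffffULL;
}

/* Returns 1 if the address lies in the lower (user) half,
 * 0000000000000000 - 00007fffffffffff. */
int is_user_half(uint64_t addr)
{
    return (addr >> (SKETCH_VADDR_BITS - 1)) == 0;
}
```

With this check, 0x00007fffffffffff and 0xffff800000000000 are canonical, while every address between them falls into the hole.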
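  • The actual limit can be found in the 4-level page table definitions: four levels of 9 bits each (512 entries per table) plus a 12-bit page offset give 48-bit virtual addresses. The 46 bits quoted above concern the physical address bits stored in page table entries.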
/*
 * PGDIR_SHIFT determines what a top-level page table entry can map
 */
#define PGDIR_SHIFT 39
#define PTRS_PER_PGD  512

/*
 * 3rd level page
 */
#define PUD_SHIFT 30
#define PTRS_PER_PUD  512

/*
 * PMD_SHIFT determines the size of the area a middle-level
 * page table can map
 */
#define PMD_SHIFT 21
#define PTRS_PER_PMD  512

/*
 * entries per page directory level
 */
#define PTRS_PER_PTE  512

#define PMD_SIZE  (_AC(1, UL) << PMD_SHIFT)
#define PMD_MASK  (~(PMD_SIZE - 1))
#define PUD_SIZE  (_AC(1, UL) << PUD_SHIFT)
#define PUD_MASK  (~(PUD_SIZE - 1))
#define PGDIR_SIZE  (_AC(1, UL) << PGDIR_SHIFT)
#define PGDIR_MASK  (~(PGDIR_SIZE - 1))
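
To see how these shifts carve up an address, the sketch below splits a virtual address into its table indices and adds up the bits the four levels cover. A `PAGE_SHIFT` of 12 (4 KiB pages) is assumed. The `SK_` constants and the function names are illustrative and simply copy the values from the excerpt.

```c
#include <stdint.h>

#define SK_PAGE_SHIFT   12   /* assumed: 4 KiB pages */
#define SK_PMD_SHIFT    21
#define SK_PUD_SHIFT    30
#define SK_PGDIR_SHIFT  39
#define SK_PTRS_PER_TBL 512  /* 2^9 entries at every level */

/* Index into the table at the level identified by its shift. */
unsigned table_index(uint64_t vaddr, unsigned shift)
{
    return (unsigned)((vaddr >> shift) & (SK_PTRS_PER_TBL - 1));
}

/* Number of virtual address bits translated by the four levels:
 * PGDIR_SHIFT + log2(PTRS_PER_PGD). */
unsigned vaddr_bits(void)
{
    unsigned bits = 0;
    unsigned entries = SK_PTRS_PER_TBL;

    while (entries > 1) {   /* log2(512) = 9 */
        entries >>= 1;
        bits++;
    }
    return SK_PGDIR_SHIFT + bits;
}
```

For example, `table_index(0x00007ffffffff000, SK_PGDIR_SHIFT)` is 255 for the highest user page, while the kernel text base 0xffffffff80000000 uses PGD entry 511 and PUD entry 510. `vaddr_bits()` returns 48.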
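  • Within the 48-bit virtual address, the upper half starts at 0x800000000000 (sign-extended to 0xffff800000000000). Subtracting that from the end of the module mapping space gives almost the same amount for kernel space as for user space.
(0xfffffff00000 - 0x800000000000) == 0x7ffffff00000 == 128 TiB - 1 MiB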
