Sungju's Slow Life

Personal journal


SystemTap to monitor SLAB usage

Tracking down a memory leak in SLAB is not an easy task. The allocations happen on the kernel side and can go through several different routes. Here I want to show how we can use SystemTap to monitor SLAB alloc/free activity.

As a first step, you need to find the monitoring points. SLAB objects can be allocated via one of the functions below.

  1. kmem_cache_alloc() : general SLAB allocation with a fixed size
  2. kmalloc() : allocation of an arbitrary number of bytes

First, let’s check ‘kmem_cache_alloc’, which has the definitions below.

mm/slab.c : void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags)
mm/slub.c : void *kmem_cache_alloc(struct kmem_cache *s, gfp_t gfpflags)

However, there is one problem. kmem_cache_alloc() uses a different name for its first argument depending on the implementation: in mm/slab.c it is ‘cachep’, but in mm/slub.c it is ‘s’. You can check which argument names are used by running the command below.

$ # RHEL7
$ stap -L 'kernel.function("kmem_cache_alloc")'
kernel.function("kmem_cache_alloc@mm/slub.c:2621") $s:struct kmem_cache* $gfpflags:gfp_t

$ # RHEL6
$ stap -L 'kernel.function("kmem_cache_alloc")'
kernel.function("kmem_cache_alloc@mm/slab.c:3655") $cachep:struct kmem_cache* $flags:gfp_t

When we are not sure which name is used for an argument or variable, we can use ‘@defined()’ to check whether it exists. The code below does exactly that: it checks whether the ‘cachep’ variable is defined and, if not, reads $s->name instead of $cachep->name. kernel_string() builds a string from a kernel address.

probe kernel.function("kmem_cache_alloc") {
	name = kernel_string(@defined($cachep) ? $cachep->name : $s->name)
	check_alloc_or_free(name, 1)
}

You can ignore the ‘check_alloc_or_free()’ line for now. It basically counts SLAB allocations and releases.

The next one is ‘kmalloc()’. Here a problem comes up. Below is the ‘kmalloc()’ definition in the kernel.

static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
    if (__builtin_constant_p(size)) {
        if (size > KMALLOC_MAX_CACHE_SIZE)
            return kmalloc_large(size, flags);

        if (!(flags & GFP_DMA)) {
            int index = kmalloc_index(size);

            if (!index)
                return ZERO_SIZE_PTR;

            return kmem_cache_alloc_trace(kmalloc_caches[index],
                    flags, size);
        }
    }
    return __kmalloc(size, flags);
}

As the kmalloc slab name will be either ‘size-XXX’ or ‘kmalloc-XXX’, we could use ‘size’ to determine the correct slab name. However, if you try to read the ‘$size’ variable in a ‘kmalloc()’ probe, you will get an error.

$ stap -e 'probe kernel.function("kmalloc") { }'
WARNING: function kmalloc is in blacklisted section: keyword at <input>:1:1
 source: probe kernel.function("kmalloc") { }
         ^
WARNING: kprobes function kmalloc is blacklisted: keyword at <input>:1:1
 source: probe kernel.function("kmalloc") { }
         ^
WARNING: side-effect-free probe: keyword at <input>:1:1
 source: probe kernel.function("kmalloc") { }
...

So kmalloc() itself is blacklisted for probing. The next option is ‘__kmalloc()’, but some allocations never call __kmalloc(), depending on whether the size is a compile-time constant and on the underlying implementation (slab, slob, or slub). Here I am going to cover two locations, ‘__kmalloc()’ and ‘kmem_cache_alloc_trace()’, which together cover most situations.

We still have another issue when tracking ‘__kmalloc()’: it receives the size instead of the slab name. This can be resolved by checking both ‘size-xxx’ and ‘kmalloc-xxx’.

probe kernel.function("__kmalloc") {
	name = sprintf("size-%d", $size)
	check_alloc_or_free(name, 1)
	name = sprintf("kmalloc-%d", $size)
	check_alloc_or_free(name, 1)
}
probe kernel.function("kmem_cache_alloc_trace") {
	name = kernel_string(@defined($cachep) ? $cachep->name : $s->name)
	check_alloc_or_free(name, 1)
}
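One caveat with the ‘__kmalloc’ probe above: kmalloc() rounds a request up to the next cache size, so a 20-byte request is served from the 32-byte cache, while the probe builds the name ‘kmalloc-20’ and only matches requests whose size equals a cache size exactly. The sketch below is plain user-space C, not part of the script, and only illustrates the rounding. The helper names are hypothetical, and the size classes (powers of two from 8 bytes, plus 96 and 192, up to 8192) are an assumption based on a common x86_64 SLUB layout; compare with /proc/slabinfo on your own kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed largest kmalloc cache (two 4 KiB pages on SLUB); larger requests
 * bypass the kmalloc caches. Adjust for your kernel. */
#define ASSUMED_MAX_CACHE_SIZE 8192

/*
 * Hypothetical helper: object size of the kmalloc cache that would serve a
 * request of `size` bytes. Returns 0 when no kmalloc cache is involved
 * (zero-byte request, or a request above ASSUMED_MAX_CACHE_SIZE).
 */
size_t kmalloc_cache_size(size_t size)
{
    size_t cache = 8;   /* assumed smallest cache */

    if (size == 0 || size > ASSUMED_MAX_CACHE_SIZE)
        return 0;
    if (size > 64 && size <= 96)
        return 96;      /* non-power-of-two class */
    if (size > 128 && size <= 192)
        return 192;     /* non-power-of-two class */
    while (cache < size)
        cache <<= 1;    /* round up to the next power of two */
    return cache;
}

/* Hypothetical helper: build "kmalloc-32" or "size-32" style names. */
int kmalloc_cache_name(char *buf, size_t len, const char *prefix, size_t size)
{
    assert(buf != NULL && prefix != NULL);
    return snprintf(buf, len, "%s-%zu", prefix, kmalloc_cache_size(size));
}
```

For example, kmalloc_cache_size(20) returns 32, so such a request belongs to ‘kmalloc-32’. To count it, the probe would have to apply the same rounding to $size before building the name.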

So the allocation side is mostly covered. Now it is time to monitor SLAB releases, which can happen in two places: kmem_cache_free() and kfree().

kmem_cache_free() is quite easy, as it takes the kmem_cache as its first argument.

$ stap -L 'kernel.function("kmem_cache_free")'
 kernel.function("kmem_cache_free@mm/slub.c:2849") $s:struct kmem_cache* $x:void*

So the script part looks like this.

probe kernel.function("kmem_cache_free") {
	name = kernel_string(@defined($cachep) ? $cachep->name : $s->name)
	check_alloc_or_free(name, 2)
}

However, ‘kfree()’ is trickier, as it takes a memory address as its argument.

$ stap -L 'kernel.function("kfree")'
 kernel.function("kfree@mm/slub.c:3762") $x:void const*

Unfortunately, this is where we have to switch the script to ‘guru’ mode (-g), which lets us use C code in the script. The upside of guru mode is that you can do whatever you want in the kernel, as you are basically writing module code in C. The downside is that it can crash the system or corrupt any part of it. C code goes between ‘%{’ and ‘%}’.

Let’s put some kernel headers at the beginning of the script, as we need access to memory-related kernel functions.

%{
#include <linux/mm.h>
#include <linux/mmzone.h>
#include <linux/bootmem.h>
#include <linux/bit_spinlock.h>
#include <linux/page_cgroup.h>
#include <linux/hash.h>
#include <linux/slab.h>
#include <linux/memory.h>
#include <linux/vmalloc.h>
#include <linux/cgroup.h>
#include <linux/swapops.h>
%}

Below is the script for ‘kfree’ monitoring. It calls get_slab_page() to get the page structure from the address pointer.

probe kernel.function("kfree") {
	/*
	 * #define ZERO_SIZE_PTR ((void *)16)
	 *
	 * #define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) <= \
	 *                 (unsigned long)ZERO_SIZE_PTR)
	 *
	 */
	if (@defined($objp) || @defined($block)) {
		/* Ignore as it will be handled in '__cache_free' */
	} else {
		addr = $x
		/* Skip NULL and ZERO_SIZE_PTR; kernel addresses are negative as signed 64-bit values */
		if (addr > 16 || addr < 0) {
			page = get_slab_page(addr)
			if (page != 0) {
				if (@defined(@cast(page, "struct page")->slab)) {
					name = @cast(page, "struct page")->slab->name
				} else {
					name = @cast(page, "struct page")->slab_cache->name
				}
				if (name != 0) {
					check_alloc_or_free(kernel_string(name), 2)
				}
			}
		}
	}
}
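The ‘addr > 16 || addr < 0’ test in the probe above expresses the kernel's unsigned ZERO_OR_NULL_PTR() comparison with SystemTap's signed 64-bit integers, where kernel addresses show up as negative numbers. A small user-space C sketch shows that the two forms agree. The function names are hypothetical, not kernel API, and the cast from an unsigned to a signed 64-bit value assumes the usual two's-complement behaviour of gcc and clang.

```c
#include <assert.h>
#include <stdint.h>

#define ZERO_SIZE_PTR_VALUE 16  /* the kernel defines ZERO_SIZE_PTR as ((void *)16) */

/* Kernel-style test: unsigned comparison, true for NULL and ZERO_SIZE_PTR. */
int zero_or_null_unsigned(uint64_t addr)
{
    return addr <= ZERO_SIZE_PTR_VALUE;
}

/* Script-style test on a signed 64-bit value: true for a real pointer. */
int is_real_pointer_signed(int64_t addr)
{
    return addr > ZERO_SIZE_PTR_VALUE || addr < 0;
}

/* Self-check: the two tests must always be opposites of each other. */
void check_equivalence(uint64_t addr)
{
    assert(is_real_pointer_signed((int64_t)addr) == !zero_or_null_unsigned(addr));
}
```

Calling check_equivalence() with 0, 16, 17, and a kernel address such as 0xffffffffc0903100 passes in every case, which is why the script can skip the ‘addr < 0’ values only at the cost of ignoring every real kernel pointer.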

get_slab_page() needs to call a kernel function to find the page for the address.

function get_slab_page:long (x:long) %{
	struct page *page;

	page = (void *)virt_to_head_page((const void *) STAP_ARG_x);
	STAP_RETVALUE = (long)page;
%}

The full source code, including check_alloc_or_free(), is shown below.

#This script displays the number of given slab allocations and the backtraces leading up to it.
#The original source is from the KCS article below; I changed it to stop
#after a specified number of allocations instead of after 10 seconds.
#
#Is there a way to track slab allocations or leaks with systemtap?
#https://access.redhat.com/articles/2850581
#
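#It runs until the given slab has been allocated the specified number of times.
#Usage (guru mode is required because of the embedded C function):
#stap -g --suppress-time-limits -v --all-modules -o stap_result.txt kmem_alloc_free.stp dentry 2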

%{
#include <linux/mm.h>
#include <linux/mmzone.h>
#include <linux/bootmem.h>
#include <linux/bit_spinlock.h>
#include <linux/page_cgroup.h>
#include <linux/hash.h>
#include <linux/slab.h>
#include <linux/memory.h>
#include <linux/vmalloc.h>
#include <linux/cgroup.h>
#include <linux/swapops.h>

%}

global slab = @1
global total_alloc_count = 0
global total_free_count = 0
global max_alloc_limit = $2
global stats, stacks

probe begin {
	printf("Start to monitor \"%s\" alloc calls for %d times\n",
		slab, max_alloc_limit)
	printf("-------------------------------------------------------\n\n")
}

probe kernel.function("kmem_cache_alloc") {
	name = kernel_string(@defined($cachep) ? $cachep->name : $s->name)
	check_alloc_or_free(name, 1)
}

probe kernel.function("__kmalloc") {
	name = sprintf("size-%d", $size)
	check_alloc_or_free(name, 1)
	name = sprintf("kmalloc-%d", $size)
	check_alloc_or_free(name, 1)
}
probe kernel.function("kmem_cache_alloc_trace") {
	name = kernel_string(@defined($cachep) ? $cachep->name : $s->name)
	check_alloc_or_free(name, 1)
}

probe kernel.function("kmem_cache_free") {
	name = kernel_string(@defined($cachep) ? $cachep->name : $s->name)
	check_alloc_or_free(name, 2)
}

/*
probe kernel.function("__cache_free") {
	name = kernel_string(@defined($cachep) ? $cachep->name : $s->name)
	check_alloc_or_free(name, 2)
}
*/

probe kernel.function("kfree") {
	/*
	 * #define ZERO_SIZE_PTR ((void *)16)
	 *
	 * #define ZERO_OR_NULL_PTR(x) ((unsigned long)(x) <= \
	 *                 (unsigned long)ZERO_SIZE_PTR)
	 *
	 */
	if (@defined($objp) || @defined($block)) {
		/* Ignore as it will be handled in '__cache_free' */
	} else {
		addr = $x
		/* Skip NULL and ZERO_SIZE_PTR; kernel addresses are negative as signed 64-bit values */
		if (addr > 16 || addr < 0) {
			page = get_slab_page(addr)
			if (page != 0) {
				if (@defined(@cast(page, "struct page")->slab)) {
					name = @cast(page, "struct page")->slab->name
				} else {
					name = @cast(page, "struct page")->slab_cache->name
				}
				if (name != 0) {
					check_alloc_or_free(kernel_string(name), 2)
				}
			}
		}
	}
}

function get_slab_page:long (x:long) %{
	struct page *page;

	page = (void *)virt_to_head_page((const void *) STAP_ARG_x);
	STAP_RETVALUE = (long)page;
%}

function check_alloc_or_free(name:string, alloc_mode:long) {
	if (name == slab) {
		stats[alloc_mode, execname()] <<< 1
		stacks[alloc_mode, execname(), name, backtrace()] <<< 1

		if (alloc_mode == 1) {
			total_alloc_count = total_alloc_count + 1
			if (total_alloc_count >= max_alloc_limit) {
				exit()
			}
		} else if (alloc_mode == 2) {
			total_free_count = total_free_count + 1
		}
	}
}

probe end {
	printf("Total alloc : %d times.\n", total_alloc_count)
	printf("Total free  : %d times.\n", total_free_count)
	printf("\nNumber of %s slab allocations/freeing per process\n", slab)
	foreach ([mode, exec] in stats-) {
		printf("%s : %s:\t%d\n", mode == 1 ? "alloc" : "free",
				     exec, @count(stats[mode, exec]))
	}
	printf("\nBacktrace of processes when allocating\n")
	foreach ([mode, proc, cache, bt] in stacks-) {
		printf("Mode: %s Exec: %s Name: %s Count: %d\n",
			mode == 1 ? "alloc" : "free", proc, cache,
			@count(stacks[mode, proc, cache, bt]))
		print_stack(bt)
		printf("-------------------------------------------------------\n\n")
	}
}

Now you can run it as shown below, which monitors ‘size-32’ (pass ‘kmalloc-32’ instead on kernels that name the cache that way) until 100 allocations have happened.

$ stap -g --suppress-time-limits -v --all-modules -o stap_result.txt kmem_alloc_free.stp size-32 100

Once you have a working script, running it and reading the text output will do the job. However, sometimes you may want to see what the script code is doing in the kernel. Especially if the script causes a system crash, you may want to find the location of the crash.

Let’s say you want to check ‘get_sleb128’, which is located in the loaded stap module.

crash> emodinfo -t
struct module *    MODULE_NAME                     SIZE 
0xffffffffc0819ae0 overlay                        91659 
0xffffffffc1113540 stap_94ab28b8e29c63a7eadf5f4953f0b86e_11377    8688714 
===========================================================================
There are 2 tainted modules, tainted_mask = 0x20003000 (Tainted: GOET)

crash> emodinfo --details=stap_94ab28b8e29c63a7eadf5f4953f0b86e_11377 | grep get_sleb128
0xffffffffc0903100 (t) get_sleb128

crash> dis -l get_sleb128
0xffffffffc0903100 <get_sleb128>:	nopl   0x0(%rax,%rax,1) [FTRACE NOP]
0xffffffffc0903105 <get_sleb128+5>:	mov    (%rdi),%rax
0xffffffffc0903108 <get_sleb128+8>:	cmp    %rsi,%rax
0xffffffffc090310b <get_sleb128+11>:	jae    0xffffffffc090319a <get_sleb128+154>
0xffffffffc0903111 <get_sleb128+17>:	push   %rbp
0xffffffffc0903112 <get_sleb128+18>:	lea    0x1(%rax),%r8
0xffffffffc0903116 <get_sleb128+22>:	xor    %ecx,%ecx
0xffffffffc0903118 <get_sleb128+24>:	mov    $0x7,%r10d
0xffffffffc090311e <get_sleb128+30>:	mov    %rsp,%rbp
0xffffffffc0903121 <get_sleb128+33>:	push   %rbx
0xffffffffc0903122 <get_sleb128+34>:	movzbl (%rax),%r9d
0xffffffffc0903126 <get_sleb128+38>:	mov    $0x40,%ebx
0xffffffffc090312b <get_sleb128+43>:	xor    %eax,%eax
0xffffffffc090312d <get_sleb128+45>:	jmp    0xffffffffc0903161 <get_sleb128+97>
0xffffffffc090312f <get_sleb128+47>:	nop
0xffffffffc0903130 <get_sleb128+48>:	cmp    %rsi,%r8
0xffffffffc0903133 <get_sleb128+51>:	je     0xffffffffc0903186 <get_sleb128+134>
0xffffffffc0903135 <get_sleb128+53>:	lea    0x7(%r10),%edx
0xffffffffc0903139 <get_sleb128+57>:	add    $0x1,%r8
0xffffffffc090313d <get_sleb128+61>:	movzbl -0x1(%r8),%r9d
0xffffffffc0903142 <get_sleb128+66>:	cmp    $0x40,%edx
0xffffffffc0903145 <get_sleb128+69>:	jbe    0xffffffffc090315b <get_sleb128+91>
0xffffffffc0903147 <get_sleb128+71>:	mov    %r9d,%r11d
0xffffffffc090314a <get_sleb128+74>:	mov    %ebx,%ecx
0xffffffffc090314c <get_sleb128+76>:	and    $0x7f,%r11d
0xffffffffc0903150 <get_sleb128+80>:	sub    %r10d,%ecx
0xffffffffc0903153 <get_sleb128+83>:	shr    %cl,%r11d
0xffffffffc0903156 <get_sleb128+86>:	test   %r11d,%r11d
0xffffffffc0903159 <get_sleb128+89>:	jne    0xffffffffc0903190 <get_sleb128+144>
0xffffffffc090315b <get_sleb128+91>:	mov    %r10d,%ecx
0xffffffffc090315e <get_sleb128+94>:	mov    %edx,%r10d
0xffffffffc0903161 <get_sleb128+97>:	mov    %r9,%rdx
0xffffffffc0903164 <get_sleb128+100>:	and    $0x7f,%edx
0xffffffffc0903167 <get_sleb128+103>:	shl    %cl,%rdx
0xffffffffc090316a <get_sleb128+106>:	or     %rdx,%rax
0xffffffffc090316d <get_sleb128+109>:	test   %r9b,%r9b
0xffffffffc0903170 <get_sleb128+112>:	js     0xffffffffc0903130 <get_sleb128+48>
0xffffffffc0903172 <get_sleb128+114>:	and    $0x40,%r9d
0xffffffffc0903176 <get_sleb128+118>:	movzbl %r9b,%r9d
0xffffffffc090317a <get_sleb128+122>:	neg    %r9d
0xffffffffc090317d <get_sleb128+125>:	shl    %cl,%r9d
0xffffffffc0903180 <get_sleb128+128>:	movslq %r9d,%r9
0xffffffffc0903183 <get_sleb128+131>:	or     %r9,%rax
0xffffffffc0903186 <get_sleb128+134>:	pop    %rbx
0xffffffffc0903187 <get_sleb128+135>:	mov    %r8,(%rdi)
0xffffffffc090318a <get_sleb128+138>:	pop    %rbp
0xffffffffc090318b <get_sleb128+139>:	retq   
0xffffffffc090318c <get_sleb128+140>:	nopl   0x0(%rax)
0xffffffffc0903190 <get_sleb128+144>:	lea    0x1(%rsi),%r8
0xffffffffc0903194 <get_sleb128+148>:	pop    %rbx
0xffffffffc0903195 <get_sleb128+149>:	mov    %r8,(%rdi)
0xffffffffc0903198 <get_sleb128+152>:	pop    %rbp
0xffffffffc0903199 <get_sleb128+153>:	retq   
0xffffffffc090319a <get_sleb128+154>:	mov    %rax,%r8
0xffffffffc090319d <get_sleb128+157>:	xor    %eax,%eax
0xffffffffc090319f <get_sleb128+159>:	mov    %r8,(%rdi)
0xffffffffc09031a2 <get_sleb128+162>:	retq     

Even though the ‘-l’ option is used in the above example, it doesn’t show any source line details. This is because we don’t have debug info for this module. Running ‘mod -S’ doesn’t make any difference, as the module was compiled without debug info in the first place.

To get debug info, we should build the module from the script ourselves. You can run stap with ‘-B CONFIG_DEBUG_INFO=y’, which forces debug info to be included in the module. Also, as we need the module later in crash or objdump, it should still exist after the script has run. So we should use ‘-m’ with a module name such as ‘-m kmem_alloc_free.ko’. The listing below shows that the module is created with the specified name.

$ stap -g --suppress-time-limits -v --all-modules -o stap_result.txt -B CONFIG_DEBUG_INFO=y -m kmem_alloc_free.ko kmem_alloc_free.stp size-32 100

$ ls -l *.ko
-rw-r--r--. 1 root 13M Oct 30 16:11 kmem_alloc_free.ko

crash> mod -t
NAME             TAINTS
overlay          T
kmem_alloc_free  OE

crash> mod -s kmem_alloc_free /root/stap/kmem_alloc_free.ko
     MODULE       NAME                             SIZE  OBJECT FILE
ffffffffc195e540  kmem_alloc_free               8688776  /root/stap/kmem_alloc_free.ko 
crash> dis -l get_sleb128
/usr/share/systemtap/runtime/unwind/unwind.h: 86
0xffffffffc0903100 <get_sleb128>:	nopl   0x0(%rax,%rax,1) [FTRACE NOP]
/usr/share/systemtap/runtime/unwind/unwind.h: 87
0xffffffffc0903105 <get_sleb128+5>:	mov    (%rdi),%rax
/usr/share/systemtap/runtime/unwind/unwind.h: 91
0xffffffffc0903108 <get_sleb128+8>:	cmp    %rsi,%rax
0xffffffffc090310b <get_sleb128+11>:	jae    0xffffffffc090319a <get_sleb128+154>
/usr/share/systemtap/runtime/unwind/unwind.h: 86
0xffffffffc0903111 <get_sleb128+17>:	push   %rbp
/usr/share/systemtap/runtime/unwind/unwind.h: 92
0xffffffffc0903112 <get_sleb128+18>:	lea    0x1(%rax),%r8
/usr/share/systemtap/runtime/unwind/unwind.h: 91
0xffffffffc0903116 <get_sleb128+22>:	xor    %ecx,%ecx
/usr/share/systemtap/runtime/unwind/unwind.h: 93
0xffffffffc0903118 <get_sleb128+24>:	mov    $0x7,%r10d
/usr/share/systemtap/runtime/unwind/unwind.h: 86
0xffffffffc090311e <get_sleb128+30>:	mov    %rsp,%rbp
0xffffffffc0903121 <get_sleb128+33>:	push   %rbx
/usr/share/systemtap/runtime/unwind/unwind.h: 92
0xffffffffc0903122 <get_sleb128+34>:	movzbl (%rax),%r9d
0xffffffffc0903126 <get_sleb128+38>:	mov    $0x40,%ebx
/usr/share/systemtap/runtime/unwind/unwind.h: 88
0xffffffffc090312b <get_sleb128+43>:	xor    %eax,%eax
0xffffffffc090312d <get_sleb128+45>:	jmp    0xffffffffc0903161 <get_sleb128+97>
0xffffffffc090312f <get_sleb128+47>:	nop
/usr/share/systemtap/runtime/unwind/unwind.h: 91
0xffffffffc0903130 <get_sleb128+48>:	cmp    %rsi,%r8
0xffffffffc0903133 <get_sleb128+51>:	je     0xffffffffc0903186 <get_sleb128+134>
/usr/share/systemtap/runtime/unwind/unwind.h: 93
0xffffffffc0903135 <get_sleb128+53>:	lea    0x7(%r10),%edx
/usr/share/systemtap/runtime/unwind/unwind.h: 92
0xffffffffc0903139 <get_sleb128+57>:	add    $0x1,%r8
0xffffffffc090313d <get_sleb128+61>:	movzbl -0x1(%r8),%r9d
/usr/share/systemtap/runtime/unwind/unwind.h: 93
0xffffffffc0903142 <get_sleb128+66>:	cmp    $0x40,%edx
0xffffffffc0903145 <get_sleb128+69>:	jbe    0xffffffffc090315b <get_sleb128+91>
/usr/share/systemtap/runtime/unwind/unwind.h: 94
0xffffffffc0903147 <get_sleb128+71>:	mov    %r9d,%r11d
0xffffffffc090314a <get_sleb128+74>:	mov    %ebx,%ecx
0xffffffffc090314c <get_sleb128+76>:	and    $0x7f,%r11d
0xffffffffc0903150 <get_sleb128+80>:	sub    %r10d,%ecx
0xffffffffc0903153 <get_sleb128+83>:	shr    %cl,%r11d
0xffffffffc0903156 <get_sleb128+86>:	test   %r11d,%r11d
0xffffffffc0903159 <get_sleb128+89>:	jne    0xffffffffc0903190 <get_sleb128+144>
/usr/share/systemtap/runtime/unwind/unwind.h: 86
0xffffffffc090315b <get_sleb128+91>:	mov    %r10d,%ecx
/usr/share/systemtap/runtime/unwind/unwind.h: 93
0xffffffffc090315e <get_sleb128+94>:	mov    %edx,%r10d
/usr/share/systemtap/runtime/unwind/unwind.h: 98
0xffffffffc0903161 <get_sleb128+97>:	mov    %r9,%rdx
0xffffffffc0903164 <get_sleb128+100>:	and    $0x7f,%edx
0xffffffffc0903167 <get_sleb128+103>:	shl    %cl,%rdx
0xffffffffc090316a <get_sleb128+106>:	or     %rdx,%rax
/usr/share/systemtap/runtime/unwind/unwind.h: 99
0xffffffffc090316d <get_sleb128+109>:	test   %r9b,%r9b
0xffffffffc0903170 <get_sleb128+112>:	js     0xffffffffc0903130 <get_sleb128+48>
/usr/share/systemtap/runtime/unwind/unwind.h: 100
0xffffffffc0903172 <get_sleb128+114>:	and    $0x40,%r9d
0xffffffffc0903176 <get_sleb128+118>:	movzbl %r9b,%r9d
0xffffffffc090317a <get_sleb128+122>:	neg    %r9d
0xffffffffc090317d <get_sleb128+125>:	shl    %cl,%r9d
0xffffffffc0903180 <get_sleb128+128>:	movslq %r9d,%r9
0xffffffffc0903183 <get_sleb128+131>:	or     %r9,%rax
/usr/share/systemtap/runtime/unwind/unwind.h: 107
0xffffffffc0903186 <get_sleb128+134>:	pop    %rbx
/usr/share/systemtap/runtime/unwind/unwind.h: 104
0xffffffffc0903187 <get_sleb128+135>:	mov    %r8,(%rdi)
/usr/share/systemtap/runtime/unwind/unwind.h: 107
0xffffffffc090318a <get_sleb128+138>:	pop    %rbp
0xffffffffc090318b <get_sleb128+139>:	retq   
0xffffffffc090318c <get_sleb128+140>:	nopl   0x0(%rax)
/usr/share/systemtap/runtime/unwind/unwind.h: 95
0xffffffffc0903190 <get_sleb128+144>:	lea    0x1(%rsi),%r8
/usr/share/systemtap/runtime/unwind/unwind.h: 107
0xffffffffc0903194 <get_sleb128+148>:	pop    %rbx
/usr/share/systemtap/runtime/unwind/unwind.h: 104
0xffffffffc0903195 <get_sleb128+149>:	mov    %r8,(%rdi)
/usr/share/systemtap/runtime/unwind/unwind.h: 107
0xffffffffc0903198 <get_sleb128+152>:	pop    %rbp
0xffffffffc0903199 <get_sleb128+153>:	retq   
/usr/share/systemtap/runtime/unwind/unwind.h: 91
0xffffffffc090319a <get_sleb128+154>:	mov    %rax,%r8
/usr/share/systemtap/runtime/unwind/unwind.h: 88
0xffffffffc090319d <get_sleb128+157>:	xor    %eax,%eax
/usr/share/systemtap/runtime/unwind/unwind.h: 104
0xffffffffc090319f <get_sleb128+159>:	mov    %r8,(%rdi)
/usr/share/systemtap/runtime/unwind/unwind.h: 107
0xffffffffc09031a2 <get_sleb128+162>:	retq   

The above shows the source code locations, but not the actual source code. This can be resolved by using objdump from the command line.

$ objdump -S kmem_alloc_free.ko

kmem_alloc_free.ko:     file format elf64-x86-64


Disassembly of section .text:

...
static sleb128_t get_sleb128(const u8 **pcur, const u8 *end)
{
     130:       49 39 f0                cmp    %rsi,%r8
     133:       74 51                   je     186 <get_sleb128+0x86>
        const u8 *cur = *pcur;
     135:       41 8d 52 07             lea    0x7(%r10),%edx
        sleb128_t value = 0;
        unsigned shift;

        for (shift = 0; cur < end; shift += 7) {
     139:       49 83 c0 01             add    $0x1,%r8
     13d:       45 0f b6 48 ff          movzbl -0x1(%r8),%r9d
    const u8 cur_val = *cur++;
     142:       83 fa 40                cmp    $0x40,%edx
     145:       76 14                   jbe    15b <get_sleb128+0x5b>
        for (shift = 0; cur < end; shift += 7) {
     147:       45 89 cb                mov    %r9d,%r11d
                if (shift + 7 > 8 * sizeof(value)
     14a:       89 d9                   mov    %ebx,%ecx
     14c:       41 83 e3 7f             and    $0x7f,%r11d
{
     150:       44 29 d1                sub    %r10d,%ecx
    const u8 cur_val = *cur++;
     153:       41 d3 eb                shr    %cl,%r11d
     156:       45 85 db                test   %r11d,%r11d
     159:       75 35                   jne    190 <get_sleb128+0x90>
        sleb128_t value = 0;
     15b:       44 89 d1                mov    %r10d,%ecx
     15e:       41 89 d2                mov    %edx,%r10d
        for (shift = 0; cur < end; shift += 7) {
     161:       4c 89 ca                mov    %r9,%rdx
...
{
     18b:       c3                      retq   
     18c:       0f 1f 40 00             nopl   0x0(%rax)
                if (shift + 7 > 8 * sizeof(value)
     190:       4c 8d 46 01             lea    0x1(%rsi),%r8
                        cur = end + 1;
                        break;
                }
                value |= (sleb128_t)(cur_val & 0x7f) << shift;
     194:       5b                      pop    %rbx
     195:       4c 89 07                mov    %r8,(%rdi)
     198:       5d                      pop    %rbp
     199:       c3                      retq   
     19a:       49 89 c0                mov    %rax,%r8
                if (!(cur_val & 0x80)) {
     19d:       31 c0                   xor    %eax,%eax
     19f:       4c 89 07                mov    %r8,(%rdi)
                        value |= -(cur_val & 0x40) << shift;
     1a2:       c3                      retq   
     1a3:       0f 1f 00                nopl   (%rax)
     1a6:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
     1ad:       00 00 00 
...
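When comparing the two listings, note that crash prints runtime addresses with decimal offsets (get_sleb128+144) while objdump prints offsets inside the .ko file in hex (get_sleb128+0x90). As long as one symbol is known in both views, the translation is a simple subtraction. The helper below is a hypothetical user-space sketch of that arithmetic and assumes both addresses lie in the same section of the module.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: translate a runtime address printed by crash into the
 * offset objdump prints for the same instruction. `sym_runtime` and
 * `sym_objdump` are the addresses of one symbol that is visible in both
 * views and lives in the same section as the instruction.
 */
uint64_t runtime_to_objdump(uint64_t runtime_addr,
                            uint64_t sym_runtime,
                            uint64_t sym_objdump)
{
    assert(runtime_addr >= sym_runtime);    /* address must not precede the symbol */
    return runtime_addr - sym_runtime + sym_objdump;
}
```

With get_sleb128 at 0xffffffffc0903100 in crash and at 0x100 in the objdump listing (the jump target 186 is labelled get_sleb128+0x86), runtime_to_objdump(0xffffffffc0903190, 0xffffffffc0903100, 0x100) gives 0x190, which is the ‘lea 0x1(%rsi),%r8’ line in both outputs.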


