How to check page caches in the kernel memory.

If you want to know which files are occupying the page cache, whose size you can see with 'free' on the command line or with 'kmem -i' in crash, you can start with '/proc/sys/vm/drop_caches'.

    int drop_caches_sysctl_handler(ctl_table *table, int write,
            void __user *buffer, size_t *length, loff_t *ppos)
    {
            int ret;

            ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
    …
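As a starting point before digging into the kernel side, the cache size that 'free' reports is derived from /proc/meminfo. The sketch below pulls one field out of meminfo-style text. The helper names `meminfo_kb` and `cached_kb_from_proc` are hypothetical and are not part of any tool mentioned above; the only assumption about the file is its usual "Key:   value kB" line layout.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the value (in kB) of a "Key:   <n> kB" line in
 * /proc/meminfo-style text, or -1 if the key is not present.
 * Hypothetical helper for illustration only.
 */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* Match "key:" only at the start of a line, so that
         * "Cached" does not match "SwapCached". */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;

            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;             /* step past the newline */
    }
    return -1;
}

/* Read /proc/meminfo and return the "Cached" value in kB, or -1. */
long cached_kb_from_proc(void)
{
    char buf[8192];
    size_t n;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_kb(buf, "Cached");
}
```

Calling `cached_kb_from_proc()` before and after writing to drop_caches is one simple way to see how much of the cache was released.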

How intel_idle works

When the system is idle, meaning there is nothing to run and the swapper task is running, it calls cpuidle_idle_call() as shown below.

    [exception RIP: cpuidle_enter_state+0x57]
    RIP: ffffffff833c1a67  RSP: ffff8917ce84be60  RFLAGS: 00000202
    RAX: 0001332321398719  RBX: ffffffff82eaf37b  RCX: 0000000000000018
    RDX: 0000000225c17d03  RSI: ffff8917ce84bfd8  RDI: 0001332321398719
    RBP: ffff8917ce84be88   R8: 0000000000000130   R9: 0000000000000018
    R10: 00000000000000c3  R11: 0000000000000400…

How to write mpykdump extension

If you work with vmcores (Linux memory dumps), you are probably familiar with 'crash'. It is a powerful tool, but it doesn't cover all the data you can find in the Linux kernel. That is where 'mpykdump' comes in: a crash extension that understands Python code. mpykdump comes with many prebuilt commands that you…

Where’s PageSlab() macro???

If you are having a hard time finding the definition of PageSlab() in the Linux kernel, here's the answer. In include/linux/page-flags.h:

    #define __PAGEFLAG(uname, lname) TESTPAGEFLAG(uname, lname) \
            __SETPAGEFLAG(uname, lname) __CLEARPAGEFLAG(uname, lname)

    #define TESTPAGEFLAG(uname, lname) \
    static inline int Page##uname(const struct page *page) \
            { return test_bit(PG_##lname, &page->flags); }

    #define __SETPAGEFLAG(uname, lname) \
    static inline void __SetPage##uname(struct…
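The reason a plain text search for "PageSlab" finds nothing is that the name is assembled by the preprocessor's `##` token pasting. The following user-space sketch reproduces the pattern. It is a simplified illustration, not the kernel code: `struct page`, the `PG_*` bit positions, and the `demo_*` bit helpers are stand-ins, and the body of `__CLEARPAGEFLAG` is assumed by analogy with the set variant.

```c
/* Simplified stand-in; the kernel's struct page has many more fields. */
struct page {
    unsigned long flags;
};

/* Hypothetical bit positions, chosen only for this demo. */
enum pageflags { PG_locked, PG_slab };

/* Minimal non-atomic bit helpers standing in for the kernel's. */
static inline int demo_test_bit(int nr, const unsigned long *addr)
{
    return (int)((*addr >> nr) & 1UL);
}

static inline void demo_set_bit(int nr, unsigned long *addr)
{
    *addr |= 1UL << nr;
}

static inline void demo_clear_bit(int nr, unsigned long *addr)
{
    *addr &= ~(1UL << nr);
}

/* Page##uname pastes "Page" and the first argument into one identifier. */
#define TESTPAGEFLAG(uname, lname) \
static inline int Page##uname(const struct page *page) \
    { return demo_test_bit(PG_##lname, &page->flags); }

#define __SETPAGEFLAG(uname, lname) \
static inline void __SetPage##uname(struct page *page) \
    { demo_set_bit(PG_##lname, &page->flags); }

#define __CLEARPAGEFLAG(uname, lname) \
static inline void __ClearPage##uname(struct page *page) \
    { demo_clear_bit(PG_##lname, &page->flags); }

#define __PAGEFLAG(uname, lname) TESTPAGEFLAG(uname, lname) \
    __SETPAGEFLAG(uname, lname) __CLEARPAGEFLAG(uname, lname)

/* This one line defines PageSlab(), __SetPageSlab() and __ClearPageSlab(). */
__PAGEFLAG(Slab, slab)
```

Running the file through `gcc -E` shows the three generated functions, which is also a quick way to locate such definitions in the real header.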

What is ‘page_cache’, how is it managed, and how does ‘drop_caches’ drop these pages?

- The "buffers/cache" values reported by free include the page cache, but not the dentry cache, which is kept in the 'dentry_cache' slab.
- The page cache grows and shrinks with disk access activity and is managed per super block (that is, per disk).
- 'echo 1 > /proc/sys/vm/drop_caches' frees page caches by calling…

How to calculate available memory for mem_cgroup.

mem_cgroup checks how much memory the group can still be charged by calling the function below.

    /**
     * mem_cgroup_margin - calculate chargeable space of a memory cgroup
     * @memcg: the memory cgroup
     *
     * Returns the maximum amount of memory @mem can be charged with, in
     * pages.
     */
    static unsigned long mem_cgroup_margin(struct mem_cgroup *memcg)…
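The function body is cut off in the excerpt above, so the following is only a sketch of the arithmetic that the doc comment describes, under the assumption that the chargeable space is the limit minus the current usage, clamped at zero. `margin_pages` and its plain integer parameters are hypothetical; the kernel reads these values from the cgroup's counters and, depending on version and configuration, may also take swap accounting into consideration.

```c
/*
 * Chargeable space, in pages, for a group with the given usage and
 * limit. Assumption: margin = limit - usage, and 0 once the group is
 * at or over its limit (the subtraction must not wrap around).
 */
unsigned long margin_pages(unsigned long usage, unsigned long limit)
{
    if (usage < limit)
        return limit - usage;
    return 0;
}
```

For example, a group limited to 100 pages that currently uses 30 could be charged up to 70 more pages under this model.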

What happens if you run the two commands ‘ethtool -p ‘ and ‘ethtool ‘ in parallel?

If you start 'ethtool -p ' and then start 'ethtool ', you may see delays in the 'ethtool ' command. This is because every ethtool command starts by taking 'rtnl_lock', and 'ethtool -p' keeps running to turn the LED on and off. Below, bnx2x's identity function just turns the LED on or off. get_settings()…

Print callgraph of a function

Sometimes you may want to see which functions a function calls, several levels deep. The command below from my extension may help.

    crash> edis -c irq_exit
    {irq_exit} -+- {rcu_irq_exit} -+- {warn_slowpath_null}
                |- {idle_cpu}
                |- {tick_nohz_stop_sched_tick} -+- {ktime_get}
                |                               |- {update_ts_time_stats}
                |                               |- {sched_clock_idle_sleep_event}
                |                               |- {rcu_needs_cpu}
                |                               |- {select_nohz_load_balancer}
                |                               |- {rcu_enter_nohz}
                |                               |-…

Why do error messages not go into a pipe or a redirected path in ‘crash’?

In the example below, the error always shows up on the console.

    crash> sym ffffffffa02ef86 > /dev/null
    sym: invalid address: ffffffffa02ef86

The 'sym' command is implemented in the 'void cmd_sym(void)' function in crash.

    /*
     * This command may be used to:
     *
     * 1. Translate a symbol to its value.
     * 2. Translate a value to it…

What happens if numa=off is provided as a kernel parameter?

If "numa=off" is given as a kernel boot parameter, it sets the global variable 'numa_off', which is checked during initialization in 'x86_numa_init()' on x86_64. If numa_off is 1, 'numa_init' is not called.

    static __init int numa_setup(char *opt)
    {
            if (!opt)
                    return -EINVAL;
            if (!strncmp(opt, "off", 3))
                    numa_off = 1;
    #ifdef CONFIG_NUMA_EMU…
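The flow described above, where a boot option sets a flag and the init path later consults it, can be modeled in user space roughly as follows. This is a sketch and not the kernel source: `x86_numa_init_sketch` and the `numa_init_calls` counter are hypothetical stand-ins for the real init function and for the call to 'numa_init', and only the "off" option is handled.

```c
#include <errno.h>
#include <string.h>

static int numa_off;         /* mirrors the global flag named in the text */
static int numa_init_calls;  /* hypothetical: counts simulated numa_init() calls */

/* Parse the value given after "numa=" on the kernel command line. */
int numa_setup(const char *opt)
{
    if (!opt)
        return -EINVAL;
    if (!strncmp(opt, "off", 3))
        numa_off = 1;
    return 0;
}

/* Stand-in for the init path: skip NUMA setup when the flag is set. */
void x86_numa_init_sketch(void)
{
    if (!numa_off)
        numa_init_calls++;   /* the real code would call numa_init() here */
}
```

Because the comparison uses strncmp() with a length of 3, any option value that begins with "off" sets the flag in this sketch.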

An example case with some of my commands

The system had a high load average and had been unresponsive for a long time, which is a typical hang situation.

    crash> sys | egrep -e LOAD -e CPUS
            CPUS: 14
    LOAD AVERAGE: 520.69, 210.35, 79.69
    crash> hangcheck
    [0 00:00:00.003] [UN] PID: 5507 TASK: ffff8d257723cf10 CPU: 6 COMMAND: "ora_dia0_gladp6"
    [0 00:00:00.006] [UN] PID: 6068 TASK: ffff8d266239cf10 CPU: 7 COMMAND:…