Linux Kernel Module Rootkit — Syscall Table Hijacking
In this article, we look at system call hijacking through the Linux kernel syscall table. I'll show you several ways to get the syscall table's address and how to use it to hook a system call, making it do (almost) anything you want.
LKM Recap
I assume that if you got this far, you already have a decent understanding of what an LKM (Loadable Kernel Module) is; if so, feel free to skip to the next part.
An LKM is an object file that can be inserted into a running kernel. It is mostly used to extend the kernel's functionality (device drivers, filesystems, etc.). Another use is creating a rootkit that operates from inside the kernel.
Protection rings
An operating system has two modes, user mode and kernel mode, which are defined by the protection rings. "Protection rings" are a hierarchy of privilege levels in a system. There are four rings (ring 0 through ring 3), and the lower the number, the more privileged the ring. Most modern systems actually use only two of them: ring 0, also known as kernel mode, and ring 3, which is user mode.
Code that runs in kernel mode has access to all system resources, including the most "sensitive" parts, so a kernel-mode bug that touches the wrong resource, or simply crashes, can bring down the whole system. User-mode processes, by contrast, have very limited access to resources and cannot touch the sensitive ones directly. It was designed this way to protect (hence the name, protection rings) the system from big mistakes.
Rootkits
From Wikipedia:
A rootkit is a collection of computer software, typically malicious, designed to enable access to a computer or an area of its software that is not otherwise allowed (for example, to an unauthorised user) and often masks its existence or the existence of other software.
When speaking about rootkits, we should distinguish between two kinds: kernel-mode rootkits, which run with kernel privileges, and user-mode rootkits, which run with user privileges. A user-mode rootkit can replace binaries such as ls, ss, and cat. It may also hook dynamically linked libraries to change the behaviour of certain functions.
A kernel-mode rootkit, however, can do much more thanks to the privileges it has: changing kernel-level function pointers, changing kernel code, manipulating important data structures and, most importantly, hooking system calls.
System Calls
The kernel acts as the connection between user programs and the machine, so every time a program needs to do something on the machine, it "talks" to the kernel and asks it to pass the message on. This "talking" is made possible by system calls.
A system call is a kernel function that user-space programs can invoke. When a program needs a service from the kernel, it asks the kernel to execute a system call. For example, the cat command in Linux uses the system calls open() to open the file, read() to read it, write() to print the contents to the screen, and close() to close the file (it also uses a few more system calls that I didn't mention here).
System calls execute only in kernel context because they need to access parts of the system that only the kernel can reach (protection rings, remember?).
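To make the cat example concrete, here is a minimal user-space sketch that copies a file to a descriptor using only the four calls named above (through their libc wrappers). The function name cat_file and the buffer size are my own choices for illustration, not anything taken from the real cat source.

```c
#include <fcntl.h>
#include <unistd.h>

/* Copy the file at `path` to the descriptor `out_fd` using only
 * open(), read(), write() and close().
 * Returns the number of bytes copied, or -1 on error. */
long cat_file(const char *path, int out_fd)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;

        /* write() may accept fewer bytes than requested, so loop. */
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(fd);
                return -1;
            }
            off += w;
        }
        total += n;
    }

    close(fd);
    return n < 0 ? -1 : total;
}
```

Calling cat_file("/etc/hostname", STDOUT_FILENO) would behave roughly like cat on that file. Each of the four calls crosses from user mode into the kernel, and that crossing is exactly where a syscall hook sits.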
Why is it important to us?
Imagine you could change the read() system call so that every time a user tries to read a stream of bytes, it returns only the bytes you want it to return. That way you could hide secret data in any file and the user wouldn't even know about it.
The options are limited only by your imagination, because every user-mode process has to ask the kernel for these services, by definition.
The Syscall Table
The syscall table is an array in the kernel that holds a pointer to each of the system calls (syscalls) the operating system has to offer.
void *sys_call_table[NR_syscalls] = {
[0 ... NR_syscalls-1] = sys_ni_syscall,
#include <asm/unistd.h>
};
As you can see above, sys_call_table is an array of size NR_syscalls, a macro defined in the kernel that holds the maximum number of allowed syscalls. All the elements of the syscall table are first initialised to sys_ni_syscall, which simply returns -ENOSYS, so every syscall that isn't implemented yet gets redirected to sys_ni_syscall. When a new syscall is implemented, the slot reserved for it in sys_call_table is changed to contain a pointer to the newly implemented syscall.
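The same initialisation pattern can be tried in plain user-space C. The sketch below is a toy model, not kernel code: the table size, the handler names and the dispatch function are all made up for illustration. It relies on the `[a ... b] = value` range initialiser, a GNU C extension that GCC and Clang accept.

```c
#include <errno.h>

#define NR_DEMO_CALLS 8   /* hypothetical table size, stands in for NR_syscalls */

typedef long (*demo_call_t)(long arg);

/* Default handler, playing the role of sys_ni_syscall. */
static long demo_ni_call(long arg)
{
    (void)arg;
    return -ENOSYS;
}

/* Two "implemented" calls, invented for this example. */
static long demo_double(long arg) { return arg * 2; }
static long demo_negate(long arg) { return -arg; }

/* Every slot starts as the default handler; implemented calls then
 * override their own slot. Later initialisers win over earlier ones. */
static demo_call_t demo_call_table[NR_DEMO_CALLS] = {
    [0 ... NR_DEMO_CALLS - 1] = demo_ni_call,
    [1] = demo_double,
    [3] = demo_negate,
};

/* Look up the handler by number and call it, as the kernel's
 * syscall entry path does with the real table. */
long demo_dispatch(unsigned int nr, long arg)
{
    if (nr >= NR_DEMO_CALLS)
        return -ENOSYS;
    return demo_call_table[nr](arg);
}
```

Here demo_dispatch(1, 21) reaches demo_double, while any slot that was never overridden falls through to the default handler and reports -ENOSYS.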
Obtaining the System Call Table Address
Finally, the fun part!
The main goal of this article is to teach you how to hijack the syscall table, which starts with getting the syscall table's address in memory so that you can modify it as you wish.
In past versions of the Linux kernel, the syscall table symbol (sys_call_table) was exported, so a module could reference it directly. The export was removed for obvious reasons, so attackers had to come up with new ways to get the address of the syscall table. Here are a few ways to achieve this goal:
By Memory Seeking
The laziest (but not the most efficient) way to search for anything is to go through all the places it could be. At every candidate address, you check whether the value stored there matches something you already know (we'll use the address of sys_close as the reference).
For this method, we need a place to start: a memory address we know for sure is lower than the address of sys_call_table. From there we walk upward through memory in a loop, one step per iteration, until we find the address that, when a certain offset is added to it, holds the address we are searching for.
Let’s break it down:
We initialise i to the address of the sys_close() function. We assume that this function sits at a lower memory address than the syscall table because it is loaded into memory first at boot. The loop stops when i reaches the maximum value an unsigned long can hold, since that is the last address in the system. On every iteration we increment i by the size of void *.
On every iteration of the loop, as said earlier, we treat i as the address of a candidate table and compare the entry at offset __NR_close (the slot designated for sys_close) with the address of sys_close() itself. If they are equal, we have found the syscall table; if not, we continue to the next iteration.
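The scan logic is easier to follow on a toy model. The sketch below is a user-space simulation under my own assumptions: it searches an ordinary array of pointer-sized words rather than kernel memory, and MOCK_NR_CLOSE, mock_close and find_table are hypothetical names standing in for __NR_close, sys_close and the search loop.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_NR_CLOSE 3   /* hypothetical index, stands in for __NR_close */

/* A function whose address we already know, standing in for sys_close. */
static long mock_close(long fd)
{
    (void)fd;
    return 0;
}

/* Walk `words` pointer-sized slots starting at `start`. At each step,
 * treat the current slot as the beginning of a candidate table and check
 * whether its MOCK_NR_CLOSE-th entry holds the known function address.
 * Returns the candidate that matches, or NULL if none does. */
uintptr_t *find_table(uintptr_t *start, size_t words, uintptr_t known_fn)
{
    for (size_t i = 0; i + MOCK_NR_CLOSE < words; i++) {
        uintptr_t *candidate = &start[i];

        if (candidate[MOCK_NR_CLOSE] == known_fn)
            return candidate;
    }
    return NULL;
}
```

With a 64-word buffer whose word 43 holds the address of mock_close, find_table returns a pointer to word 40, the start of the mock table. A single comparison like this can match by coincidence, so a more careful search would verify more than one known entry before trusting the result.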
By the /proc/kallsyms file
As you probably already know, everything in Linux is a file, and one particular file will help us get what we want: /proc/kallsyms.
The /proc/kallsyms file is a special file that lists the symbols of the kernel's static code and of all dynamically loaded kernel modules. In other words, it has the whole kernel mapping in one place. So the algorithm is simple: read the file and search for the sys_call_table symbol.

Let’s talk about reading a file.
Reading a file is the act of putting a stream of data into a buffer, so you will be able to access that data.
As written above, protection rings exist to guard data from being misused. Because of that, the kernel's file-reading functions expect the destination buffer to be in user space and check its address against the global variable addr_limit, so by default a kernel module cannot hand them a buffer that lives in kernel space. Fortunately, there is a way around this: changing addr_limit. addr_limit is the highest address unprivileged code is allowed to access, and if we raise it, the check also accepts kernel-space addresses. set_fs() is the function we will use to do so (on kernels that still provide it; set_fs() was removed from x86 in Linux 5.10).
Breaking it down:
First, we use set_fs() to raise addr_limit so that we can read the /proc/kallsyms file into a kernel buffer. After reading it with vfs_read(), we set addr_limit back to its original value (this is very important: if we don't, any user-mode process could manipulate the kernel address space).
From then on, it is straightforward. We go through the file line by line, check whether the line contains the "sys_call_table" string, and if so, save the address of the table and return it.
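The line-by-line search can be sketched in user space on text laid out like /proc/kallsyms, where each line is a hex address, a one-character type and a symbol name. The function name find_symbol is hypothetical, and the sketch assumes well-formed input lines. Unlike a plain substring check, it compares the whole symbol name, so a symbol such as ia32_sys_call_table does not match by accident.

```c
#include <stdio.h>
#include <string.h>

/* Look up `name` in text formatted like /proc/kallsyms:
 *     <hex address> <type char> <symbol name> [module]
 * Returns the symbol's address, or 0 if it is not found. */
unsigned long find_symbol(const char *text, const char *name)
{
    const char *line = text;

    while (line && *line) {
        unsigned long addr;
        char type;
        char sym[128];

        /* Parse the first three fields of the current line. */
        if (sscanf(line, "%lx %c %127s", &addr, &type, sym) == 3 &&
            strcmp(sym, name) == 0)
            return addr;

        /* Advance to the character after the next newline, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return 0;
}
```

Given a line such as "ffffffff81a001c0 R sys_call_table" (the address here is invented for the example), find_symbol(text, "sys_call_table") returns 0xffffffff81a001c0 on a 64-bit system.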
By kallsyms_lookup_name()
And now for the funny part.
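Everything we have done so far to get the address of the system call table can be done by calling a single function, kallsyms_lookup_name(), which is declared in linux/kallsyms.h. Note that modules can only call it on kernels that still export it; the export was removed in Linux 5.7.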
printk("The address of sys_call_table is: %lx\n", kallsyms_lookup_name("sys_call_table"));
There is not much to say about this function, except that it looks up a symbol by name and returns its address. Thus, you only need to look up the name sys_call_table and there you have it: the kernel's system call table in your hands.
Hooking a syscall
After getting the syscall table, we will use it to create a syscall hook, that is, to change the behaviour of a certain system call.
In a previous section, we talked about how the system call table is an array of addresses, and every single address the array contains is an address of a syscall. When a user asks for a syscall, the kernel goes to the syscall table, extracts the address of the syscall, and then instructs the CPU to go to that function. If we change the address saved in the table, the kernel will instruct the CPU to go into our function, and in there, we could do whatever we want.
cr0
The CPU has several kinds of registers, and one kind is the "control register". A control register holds flags that change the behaviour of the CPU. One of these registers is cr0, and one of its flags is WP (write protect), which tells the CPU whether kernel-mode code may write to read-only pages (for the other flags, please visit the Wikipedia page on control registers). When WP is set to 1, the CPU may not write to a read-only page, even in kernel mode; when it is set to 0, kernel-mode code may write to any mapped page, read-only or not.
So, given that the system call table lives in a read-only section, we need to clear the WP flag (set it to 0) before changing an address in it, and set it back to 1 afterwards.
#define unprotect_memory() \
({ \
    orig_cr0 = read_cr0(); \
    write_cr0(orig_cr0 & (~0x10000)); /* Clear the WP flag (bit 16) */ \
})

#define protect_memory() \
({ \
    write_cr0(orig_cr0); /* Restore the saved cr0, WP flag included */ \
})
The two macros above clear the WP flag in the cr0 register and then restore its original value. We will use them when we get to the actual hooking.
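The constant 0x10000 is simply bit 16 of cr0. The arithmetic can be checked on an ordinary integer in user space, without touching any register; the helper names below are my own, and the sample value in the usage note is only an illustrative cr0 value.

```c
#define CR0_WP_BIT  16
#define CR0_WP_MASK (1UL << CR0_WP_BIT)   /* equals 0x10000 */

/* Return `cr0` with the WP bit cleared and every other bit untouched. */
unsigned long clear_wp(unsigned long cr0)
{
    return cr0 & ~CR0_WP_MASK;
}

/* Report whether the WP bit is set in `cr0`. */
int wp_is_set(unsigned long cr0)
{
    return (cr0 & CR0_WP_MASK) != 0;
}
```

For example, clear_wp(0x80050033UL) gives 0x80040033UL: only bit 16 changes. Restoring write protection is a matter of writing the saved original value back, which is what the second macro does.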
The Actual Hooking
And now for the moment of truth, the actual hooking.
All we do to change the system call is replace the address saved in the syscall table with an address of our choosing.
And that's it. That is system call hooking in the Linux kernel.
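The pointer swap itself can be illustrated with an ordinary function-pointer table in user space. This is a toy model under my own assumptions, not kernel code: the table, the handlers and the install_hook and remove_hook helpers are all hypothetical names. It shows the three steps a hook needs: save the original pointer, store the new one, and put the original back when done.

```c
#include <stddef.h>

typedef long (*handler_t)(long arg);

/* The "real" handler that normally sits in slot 1 of the table. */
static long real_handler(long arg)
{
    return arg + 1;
}

static handler_t table[4] = { NULL, real_handler, NULL, NULL };
static handler_t saved_handler;   /* original pointer, kept for restoring */

/* The replacement: calls the original, then changes its result. */
static long hooked_handler(long arg)
{
    return saved_handler(arg) * 10;
}

/* Save the current entry of slot `nr` and point the slot at `hook`. */
void install_hook(unsigned int nr, handler_t hook)
{
    saved_handler = table[nr];
    table[nr] = hook;
}

/* Put the saved original pointer back into slot `nr`. */
void remove_hook(unsigned int nr)
{
    table[nr] = saved_handler;
}

/* Dispatch through the table, as a caller would. */
long call(unsigned int nr, long arg)
{
    return table[nr](arg);
}
```

Before the hook is installed, call(1, 4) returns 5. After install_hook(1, hooked_handler) it returns 50, and after remove_hook(1) it returns 5 again. Callers never notice that the pointer changed, which is what makes table hooking effective.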
Things you may do with hooks
There are many options for what to do with hooks. For example, you may add functionality to some syscalls or use hooking to create a kernel rootkit (which is probably why you are here). The syscalls most commonly hooked by rootkits are:
kill: Mostly used to add custom signals that help you communicate with your rootkit.
getdents (and getdents64): Used to hide files and processes.
read: Used for keylogging.
shutdown: To change the shutdown sequence.
ioctl: Change basic ioctl requests.
And so much more, depending on your imagination.

Written by:
@GoldenOak
@TSn0w