Locating the kernel PGD on Android/aarch64

by Vitaly Nikolenko

Posted on December 21, 2020 at 6:40PM

Thought I'd do a quick post explaining what I was trying to say in my tweet, since the questions/comments were mostly "but you need some arbitrary read to read that page" or "but you need to know where _text is on a KASLR-enabled device." Yes to both. I was referring to a post-exploitation technique where you already have some arbitrary/partial kernel read/write working.

Why would you want to locate the kernel PGD (read &swapper_pg_dir)? To bypass certain kernel exploitation mitigations on certain devices. If that doesn't make sense, go back and re-read KSMA (kernel space mirroring attack), which is the name the technique was given on Android/aarch64 after it was simply called "modifying page tables" on x86 back in the day. But in vulnerability research, as you may know, everything old is new again.

Hopefully that clears up why you'd want to find the PGD. To locate the kernel PGD (or, to be specific, the address TTBR1 points to), you need to know where _text starts. If you have access to kallsyms - great, just extract it from there. If not, search for the kernel magic (yes, kernel magic is real!): start with a known text address, ideally as close to the start of the text segment as possible, and work your way backwards until you find the kernel magic value. The page containing the kernel magic has two useful values: (1) the kernel text offset and (2) the kernel image size.

sunfish:/ # grep _text /proc/kallsyms | head -1
ffffff93e7880000 T _text
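When kallsyms is not available, the backwards search can be sketched roughly as below. This is only a sketch: `read_bytes(addr, n)` is a hypothetical wrapper around whatever arbitrary read you already have, `find_text` and `max_pages` are names made up for illustration, and the magic position (the 4 bytes `ARM\x64` at byte offset 56) assumes the standard arm64 Image header layout.

```python
import struct

PAGE_SIZE = 0x1000
ARM64_MAGIC = b"ARM\x64"   # 0x644d5241 when read as a little-endian u32
MAGIC_OFFSET = 56          # assumed: magic field offset in the arm64 Image header


def find_text(read_bytes, known_text_addr, max_pages=0x10000):
    """Scan backwards page by page until the header magic is found.

    read_bytes(addr, n) is a hypothetical arbitrary-read wrapper that
    returns n bytes from kernel virtual address addr.
    Returns (_text address, text offset, image size) or None.
    """
    page = known_text_addr & ~(PAGE_SIZE - 1)
    for _ in range(max_pages):
        if read_bytes(page + MAGIC_OFFSET, 4) == ARM64_MAGIC:
            # First three qwords: code0/code1, text offset, image size.
            _code, text_offset, image_size = struct.unpack(
                "<QQQ", read_bytes(page, 24))
            return page, text_offset, image_size
        page -= PAGE_SIZE
    return None
```

A real read primitive may fault on unmapped pages, which is one more reason to start from an address as close to _text as possible.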

For example, on my test device with KASLR enabled, the second and third qwords below are the kernel text offset and the image size respectively. If you know the virtual address of _text (or any virtual address within that page), the PGD can be located by adding the image size to _text and subtracting two pages:

(gdb) x/4xg 0xffffff93e7880000
0xffffff93e7880000:     0x0000000014608000      0x0000000000080000
0xffffff93e7880010:     0x0000000002426000      0x000000000000000a
(gdb) x/xg 0xffffff93e7880000 + 0x0000000002426000 - 0x2000
0xffffff93e9ca4000:     0x00000001f303f003
(gdb) p/x &swapper_pg_dir
$1 = 0xffffff93e9ca4000
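The same arithmetic in Python, as a small sketch. The helper names are made up for illustration, and the two-page distance is simply what was observed on this device; it depends on the kernel's link layout and may differ on other builds.

```python
import struct

PAGE_SIZE = 0x1000


def parse_image_header(first_32_bytes):
    """Return (text_offset, image_size) from the first four qwords at _text."""
    _code, text_offset, image_size, _flags = struct.unpack("<4Q", first_32_bytes)
    return text_offset, image_size


def pgd_from_text(text_va, image_size):
    """PGD = _text + image size - two pages (layout observed on this device)."""
    return text_va + image_size - 2 * PAGE_SIZE


# The four qwords from the gdb dump above.
hdr = struct.pack("<4Q", 0x14608000, 0x80000, 0x2426000, 0xa)
text_offset, image_size = parse_image_header(hdr)
print(hex(pgd_from_text(0xffffff93e7880000, image_size)))  # 0xffffff93e9ca4000
```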

If the virtual address of _text is not known, the kernel PGD can be found in physmap as well. Assuming physmap is not randomised, we have the following:

(gdb) x/xg 0xffffffc000000000 + 0x0000000002086000 + 0x0000000000080000 - 0x2000
0xffffffc002104000:     0x00000001fa7fe003

where 0x80000 is the kernel text offset from above and 0x2086000 is the image size of the non-KASLR kernel used in the walk below. If physmap is randomised, we can still search for certain markers (kernel magic?) and find the physmap slide.
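As a sketch, the physmap variant is the same sum with the physmap base in place of _text. The base value below assumes a non-randomised physmap on a 39-bit VA kernel, and the calculation assumes the image sits at the text offset from the start of the mapping, which is what the gdb expression above implies. `pgd_in_physmap` is a made-up name.

```python
PAGE_SIZE = 0x1000
PHYSMAP_BASE = 0xffffffc000000000  # assumed: non-randomised physmap, 39-bit VA


def pgd_in_physmap(image_size, text_offset, physmap_base=PHYSMAP_BASE):
    """Physmap alias of the PGD: base + text offset + image size - two pages."""
    return physmap_base + text_offset + image_size - 2 * PAGE_SIZE


print(hex(pgd_in_physmap(0x2086000, 0x80000)))  # 0xffffffc002104000
```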

Now that we know where the PGD is, let's walk the page tables manually as a demo. The goal is to find the PTE for 0xffffff8008080000, i.e., the _text address on a device with a 39-bit virtual address space (currently all mainstream Android mobile devices on the market) and no KASLR for simplicity.

First let's get the page table indices from the virtual address:

(0xffffff8008080000 & 0x7fc0000000) >> 30 => 0 (pgd / pud)
(0xffffff8008080000 & 0x3fe00000) >> 21 => 0x40 (pmd)
(0xffffff8008080000 & 0x1ff000) >> 12 => 0x80 (pte)

i.e., standard 39-bit VA with 4k pages configuration.
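The index extraction can also be written with shifts instead of masks. A small sketch (the function name is made up), valid only for the 39-bit VA / 4k page configuration where each level resolves 9 bits:

```python
LEVEL_MASK = 0x1ff  # 9 bits per level with 4k pages


def table_indices(va):
    """Return (pgd, pmd, pte) indices for a 39-bit VA with 4k pages."""
    pgd = (va >> 30) & LEVEL_MASK  # VA bits 38:30 (pgd / pud)
    pmd = (va >> 21) & LEVEL_MASK  # VA bits 29:21
    pte = (va >> 12) & LEVEL_MASK  # VA bits 20:12
    return pgd, pmd, pte


print([hex(i) for i in table_indices(0xffffff8008080000)])  # ['0x0', '0x40', '0x80']
```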

Since the pgd index is 0, the descriptor we need is the first entry of the PGD. The physical address of the pmd is then 0x1fa7fe000 (bits 47:12 of the pgd descriptor):

(gdb) x/3xg 0xffffff8008080000
0xffffff8008080000 <_text>:     0x0000000014608000      0x0000000000080000
0xffffff8008080010 <_text+16>:  0x0000000002086000
(gdb) x/xg 0xffffff8008080000 + 0x2086000 - 0x2000
0xffffff800a104000:     0x00000001fa7fe003

Since we're walking the page tables manually, we need to convert this physical address into the virtual (physmap) address:

(gdb) p/x memstart_addr
$1 = 0x80000000

This gives us the first physical address (where physical mapping starts). Then to convert this to a virtual address in physmap, we add the physmap offset and subtract the physical start offset shown above:

(gdb) p/x 0xffffffc000000000 + 0x1fa7fe000 - 0x80000000
$3 = 0xffffffc17a7fe000
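The two conversions used so far (descriptor to physical address, physical address to physmap address) look like this in Python. This is a sketch with made-up helper names; the physmap base assumes no randomisation, and `MEMSTART_ADDR` is the memstart_addr value read on this particular device.

```python
PHYSMAP_BASE = 0xffffffc000000000   # assumed: non-randomised physmap, 39-bit VA
MEMSTART_ADDR = 0x80000000          # memstart_addr on this device
ADDR_MASK = 0x0000fffffffff000      # descriptor bits 47:12


def desc_to_phys(desc):
    """Physical address of the next-level table (or page) from a descriptor."""
    return desc & ADDR_MASK


def phys_to_virt(pa, physmap_base=PHYSMAP_BASE, memstart=MEMSTART_ADDR):
    """Physmap virtual address for a physical address."""
    return physmap_base + pa - memstart


pmd_pa = desc_to_phys(0x00000001fa7fe003)
print(hex(pmd_pa), hex(phys_to_virt(pmd_pa)))  # 0x1fa7fe000 0xffffffc17a7fe000
```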

This time, the pmd index is 0x40, and since all descriptors (block, table, page) are 8 bytes, the entry sits at byte offset 0x40 * 8:

(gdb) x/xg 0xffffffc17a7fe000 + 0x40 * 8
0xffffffc17a7fe200:     0x00000001fa7fd003

Finally, the page descriptor (at index 0x80) can be obtained using the same approach:

(gdb) x/xg 0xffffffc000000000 + 0x00000001fa7fd000 - 0x80000000 + 0x80 * 8
0xffffffc17a7fd400:     0x00d0000080080793
(gdb) p/t 0x00d0000080080793
$4 = 11010000000000000000000010000000000010000000011110010011
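Putting the steps together, the whole walk fits in a short loop. This is a sketch, not the GDB script mentioned in this post: `read_qword(addr)` is a hypothetical 8-byte read primitive, and the loop assumes 39-bit VA, 4k pages and table descriptors at the first two levels (it bails out on block mappings rather than handling them).

```python
ADDR_MASK = 0x0000fffffffff000  # descriptor bits 47:12


def walk(read_qword, pgd_va, va, physmap_base, memstart):
    """Return the page (level 3) descriptor that maps va.

    read_qword(addr) is a hypothetical arbitrary-read wrapper returning
    the 8-byte value at kernel virtual address addr.
    """
    table_va = pgd_va
    for shift in (30, 21, 12):
        idx = (va >> shift) & 0x1ff
        desc = read_qword(table_va + idx * 8)
        if desc & 1 == 0:
            raise ValueError("invalid descriptor at shift %d" % shift)
        if shift == 12:
            return desc
        if desc & 3 != 3:
            raise ValueError("block mapping at shift %d, not handled" % shift)
        # Next-level table: physical address -> physmap virtual address.
        table_va = physmap_base + (desc & ADDR_MASK) - memstart


# Fake memory holding the three descriptors from the gdb session above.
mem = {
    0xffffff800a104000: 0x00000001fa7fe003,
    0xffffffc17a7fe200: 0x00000001fa7fd003,
    0xffffffc17a7fd400: 0x00d0000080080793,
}
pte = walk(mem.__getitem__, 0xffffff800a104000, 0xffffff8008080000,
           0xffffffc000000000, 0x80000000)
print(hex(pte))  # 0xd0000080080793
```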

Bits 7:6 of the page descriptor (AP[2:1]) encode the r/w access permissions. In our example, they're 10, meaning the page is read-only from EL1 and not accessible from EL0. Quiz time: what would happen if we replace these bits with 01?
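Pulling the individual attribute fields out of a page descriptor is just shifting and masking. A sketch with made-up names, using the stage 1 page descriptor bit positions and the AP[2:1] encoding as I understand them from the architecture manual:

```python
AP_MEANING = {
    0b00: "R/W (EL1), none (EL0)",
    0b01: "R/W (EL1), R/W (EL0)",
    0b10: "RO (EL1), none (EL0)",
    0b11: "RO (EL1), RO (EL0)",
}


def decode_pte(pte):
    """Split a stage 1 page descriptor into its attribute fields."""
    return {
        "AttrIndx": (pte >> 2) & 0x7,
        "NS": (pte >> 5) & 1,
        "AP": (pte >> 6) & 0x3,
        "SH": (pte >> 8) & 0x3,
        "AF": (pte >> 10) & 1,
        "nG": (pte >> 11) & 1,
        "Contiguous": (pte >> 52) & 1,
        "PXN": (pte >> 53) & 1,
        "UXN": (pte >> 54) & 1,
    }


fields = decode_pte(0x00d0000080080793)
print(bin(fields["AP"]), AP_MEANING[fields["AP"]])  # 0b10 RO (EL1), none (EL0)
```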

A GDB script that automates the walk described above and dumps the PTE attributes can be found here.

(gdb) source ~/repos/gdb_scripts/page_table.py 
(gdb) get_pte &selinux_enforcing
Kernel image size = 0x2096000
Kernel PGD = 0xffffff800a114000
PGD offset = 0
PMD physical address = 0x1fa7fe000
PMD virtual address = 0xffffffc17a7fe270
--- PTE dump ---
PTE value = 0xe8000081c00711
AttrIndx = 100
NS = 0
AP = 00: R/W (EL1) and None (EL0)
SH = 11
AF = 1
nG = 0
Contiguous = 0
PXN = 1
UXN = 1
-- Software defined PTE bits --