Hacking Java Bytecode for Programmers (Part 2) – Lions, and Tigers, and OP Codes, OH MY!

Introduction

In Part 1, I showed you the basics of Hexadecimal, Hex Editors, and Java Bytecode. Refer to that post if you need to catch up. The following will be a simple exercise showing you how to manipulate the Java Bytecode directly.

GOAAAAAAAAAAAAAAAL!

Let’s set ourselves up with a goal for this exercise.

You should have the following User.java file on your system.

public class User {
 
        protected int status = 0;
 
        public boolean setStatusTrue() {
                return this.status == 1;
        }
 
        public static void main(String[] args) {
           System.out.println("Hacking Java Bytecode!");
        }
 
}

We compiled it using javac.

$ javac User.java

Thus, we should also have a User.class file.

$ ls
User.class  User.java
$

At this point, let’s pretend that we were never given the source file. For clarity, rename the source file to User.java.del.

$ mv User.java User.java.del
$ ls
User.class  User.java.del

In this scenario, even though you do not have the source code, you still have the compiled class file that the JVM can execute. Let’s run it now.

$ java User 
Hacking Java Bytecode!
$

Our goal will be to change the output from Hacking Java Bytecode! to l33t hax0r bro by modifying only the compiled class file.

Understanding Java Opcodes (Operation Codes)

When we compiled our code using javac, it took the human-readable goodness that we cooked up:

public class User {
 
        protected int status = 0;
 
        public boolean setStatusTrue() {
                return this.status == 1;
        }
 
        public static void main(String[] args) {
           System.out.println("Hacking Java Bytecode!");
        }
 
}

And turned it into the computer-digestible binary awesome-sauce that the JVM needs. Here it is in hexadecimal, as dumped by xxd.

$ xxd User.class
0000000: cafe babe 0000 0033 0024 0a00 0700 1509  .......3.$......
0000010: 0006 0016 0900 1700 1808 0019 0a00 1a00  ................
0000020: 1b07 001c 0700 1d01 0006 7374 6174 7573  ..........status
0000030: 0100 0149 0100 063c 696e 6974 3e01 0003  ...I...<init>...
0000040: 2829 5601 0004 436f 6465 0100 0f4c 696e  ()V...Code...Lin
0000050: 654e 756d 6265 7254 6162 6c65 0100 0d73  eNumberTable...s
0000060: 6574 5374 6174 7573 5472 7565 0100 0328  etStatusTrue...(
0000070: 295a 0100 0d53 7461 636b 4d61 7054 6162  )Z...StackMapTab
0000080: 6c65 0100 046d 6169 6e01 0016 285b 4c6a  le...main...([Lj
0000090: 6176 612f 6c61 6e67 2f53 7472 696e 673b  ava/lang/String;
00000a0: 2956 0100 0a53 6f75 7263 6546 696c 6501  )V...SourceFile.
00000b0: 0009 5573 6572 2e6a 6176 610c 000a 000b  ..User.java.....
00000c0: 0c00 0800 0907 001e 0c00 1f00 2001 0016  ............ ...
00000d0: 4861 636b 696e 6720 4a61 7661 2042 7974  Hacking Java Byt
00000e0: 6563 6f64 6521 0700 210c 0022 0023 0100  ecode!..!..".#..
00000f0: 0455 7365 7201 0010 6a61 7661 2f6c 616e  .User...java/lan
0000100: 672f 4f62 6a65 6374 0100 106a 6176 612f  g/Object...java/
0000110: 6c61 6e67 2f53 7973 7465 6d01 0003 6f75  lang/System...ou
0000120: 7401 0015 4c6a 6176 612f 696f 2f50 7269  t...Ljava/io/Pri
0000130: 6e74 5374 7265 616d 3b01 0013 6a61 7661  ntStream;...java
0000140: 2f69 6f2f 5072 696e 7453 7472 6561 6d01  /io/PrintStream.
0000150: 0007 7072 696e 746c 6e01 0015 284c 6a61  ..println...(Lja
0000160: 7661 2f6c 616e 672f 5374 7269 6e67 3b29  va/lang/String;)
0000170: 5600 2100 0600 0700 0000 0100 0400 0800  V.!.............
0000180: 0900 0000 0300 0100 0a00 0b00 0100 0c00  ................
0000190: 0000 2600 0200 0100 0000 0a2a b700 012a  ..&........*...*
00001a0: 03b5 0002 b100 0000 0100 0d00 0000 0a00  ................
00001b0: 0200 0000 0100 0400 0300 0100 0e00 0f00  ................
00001c0: 0100 0c00 0000 3100 0200 0100 0000 0e2a  ......1........*
00001d0: b400 0204 a000 0704 a700 0403 ac00 0000  ................
00001e0: 0200 0d00 0000 0600 0100 0000 0600 1000  ................
00001f0: 0000 0500 020c 4001 0009 0011 0012 0001  ......@.........
0000200: 000c 0000 0025 0002 0001 0000 0009 b200  .....%..........
0000210: 0312 04b6 0005 b100 0000 0100 0d00 0000  ................
0000220: 0a00 0200 0000 0a00 0800 0b00 0100 1300  ................
0000230: 0000 0200 14
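
Even the first line of that dump can be read by hand. As a small illustration, the Python sketch below unpacks the first ten bytes, copied from the dump above, into the four fields that open every class file. The variable names are my own choice for this sketch; the field layout comes from the class file format.

```python
import struct

# First ten bytes of User.class, copied from the xxd dump.
header = bytes.fromhex("cafebabe 0000 0033 0024")

# Big-endian layout: 4-byte magic number, 2-byte minor version,
# 2-byte major version, 2-byte constant pool count.
magic, minor, major, pool_count = struct.unpack(">IHHH", header)

print(f"magic:               {magic:#010x}")
print(f"minor version:       {minor}")
print(f"major version:       {major}")
print(f"constant pool count: {pool_count}")
```

A major version of 51 (0x33) means the file was compiled for Java 7. The constant pool count of 36 (0x24) is one higher than the number of entries actually stored, because entry numbering starts at 1.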

There are two primary things to know concerning compiled Java code.

The first is that Opcodes (Operation Codes) written out by the compiler are simply instructions telling the JVM what to do, and each one is a single byte. In programmer speak, they are the reserved words of the JVM’s own language, and javac picked them for us on compilation.

Just for example, let’s randomly take a look at the byte 0x19 found on offset line 0x10 when we dumped our User.class file with xxd.

0000010: 0006 0016 0900 1700 1808 0019 0a00 1a00 ................

A logical question would be:

“Jared, how do we know that 0x19 is an instruction Opcode and how do we know what it actually does?”

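Lucky for us, we can use a Java Bytecode Reference, which tells us that the mnemonic for 0x19 is aload. One caveat: a byte only counts as an Opcode when it sits inside the code of a method. This particular 0x19 actually lives in the constant pool, where it is an index pointing at our string (the 08 before it marks a String entry), so treat it here purely as a value to practice the lookup on.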

This will lead to the next question.

“What is a mnemonic?”

Mnemonics are simply memory aids. A mnemonic takes something hard to remember (0x19) and associates it with something easier to remember (aload). You can think of mnemonics as a simple conversation.

You: “What is the Opcode for aload?”

Computer: “0x19 is the Opcode you are looking for.”
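
That conversation is really just a table lookup. Here is a rough Python sketch of it, run against the nine bytes that make up the body of main (they start at offset 0x20e in the xxd dump above). The table covers only a handful of Opcodes, and the names OPCODES and disassemble are invented for this sketch.

```python
# A tiny, incomplete lookup table: Opcode byte -> (mnemonic, operand bytes that follow).
OPCODES = {
    0x12: ("ldc", 1),
    0x19: ("aload", 1),
    0x2A: ("aload_0", 0),
    0xB1: ("return", 0),
    0xB2: ("getstatic", 2),
    0xB6: ("invokevirtual", 2),
}

def disassemble(code):
    """Turn raw method bytes into a list of (mnemonic, operand) pairs."""
    result = []
    i = 0
    while i < len(code):
        mnemonic, width = OPCODES[code[i]]
        # Operands are big-endian; instructions without operands get None.
        operand = int.from_bytes(code[i + 1:i + 1 + width], "big") if width else None
        result.append((mnemonic, operand))
        i += 1 + width
    return result

# The body of main(), copied from the dump.
main_code = bytes.fromhex("b2 00 03 12 04 b6 00 05 b1")
listing = disassemble(main_code)
for mnemonic, operand in listing:
    print(mnemonic if operand is None else f"{mnemonic} #{operand}")
```

The numbers after the # are constant pool indexes. Entry #4 is the String entry that points at the text we are about to change.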

Another way I visualize the functionality of a particular Opcode is as a simple procedural function. Here is how it could look using Python.

def aload(stack, pointer):
    # push an object reference onto the stack
    stack.append(pointer)
    print("Object loaded!")

stack = []
pointer = "A string pretending to be an object"
aload(stack, pointer)

$ python3 aload.py
Object loaded!
$

For now, we are not going to focus on the low-level details of Opcodes. We just need to be aware, at a high level, that javac creates them upon compilation, that they are super important, and that we don’t want to accidentally squish them when we hack on the bytes behind the ASCII text.

The second is that we are accessing the binary data using a hexadecimal tool (xxd, Bless). Because of this, we need a comparison of Hacking Java Bytecode! in both ASCII and hexadecimal.

 H  a  c  k  i  n  g \s  J  a  v  a \s  B  y  t  e  c  o  d  e  !
48 61 63 6B 69 6E 67 20 4A 61 76 61 20 42 79 74 65 63 6F 64 65 21

Each hexadecimal number corresponds to the letter or special character directly above it. Also, an important tool you will rely on is a good calculator that easily converts between ASCII, binary, and hexadecimal.
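
If you do not have such a calculator handy, a few lines of Python will do the same job. This is just a minimal sketch, and the helper name to_hex is made up for illustration.

```python
def to_hex(text):
    """Return the ASCII bytes of text as space-separated uppercase hex pairs."""
    return " ".join(f"{byte:02X}" for byte in text.encode("ascii"))

original = to_hex("Hacking Java Bytecode!")
replacement = to_hex("l33t hax0r bro")

print(original)     # the string that is in the class file now
print(replacement)  # the string we want to end up with
```

The first line printed matches the hex row in the comparison above, pair for pair.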

Bring on the hack

Now, even though the data is no longer ideal for human consumption or manipulation, this doesn’t mean we are actually prevented from hacking on it. It just means it will take a bit more work.

Take a look at these three lines from the xxd output of our User.class file.

00000c0: 0c00 0800 0907 001e 0c00 1f00 2001 0016  ............ ...
00000d0: 4861 636b 696e 6720 4a61 7661 2042 7974  Hacking Java Byt
00000e0: 6563 6f64 6521 0700 210c 0022 0023 0100  ecode!..!..".#..

You’ll notice that ASCII has been printed on the right, showing us these lines contain the data we are trying to manipulate.

Let’s open up the User.class file with Bless. I’ve taken the liberty of highlighting the lines of interest in Bless.

[Screenshot: User.class open in Bless with the lines of interest highlighted]

You can use a string-to-hex converter or hexdump from the command line to convert your string to hex.

What we need to do is replace the hexadecimal for Hacking Java Bytecode! with the hexadecimal for l33t hax0r bro. Below, we print each string and pipe it through hexdump to get its hex representation. I’m using printf with single quotes rather than echo so that the shell leaves the ! alone and no trailing newline (0A) sneaks into the output.

$ printf '%s' 'Hacking Java Bytecode!' | hexdump -v -e '/1 "%02X "' ; echo
48 61 63 6B 69 6E 67 20 4A 61 76 61 20 42 79 74 65 63 6F 64 65 21 
$
$ printf '%s' 'l33t hax0r bro' | hexdump -v -e '/1 "%02X "' ; echo
6C 33 33 74 20 68 61 78 30 72 20 62 72 6F 
$

Go ahead and run the program again.

$ java User 
Hacking Java Bytecode!
$

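Now take the output of the previous hexdump for l33t hax0r bro and paste it over the hex for Hacking Java Bytecode! in Bless. Select all twenty-two of the old bytes first so that the fourteen new bytes replace them outright. Nothing of the old string should be left behind, which means the file ends up eight bytes shorter.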

[Screenshot: the new string pasted over the old one in Bless]

Now run your program again.

$ java User
Exception in thread "main" java.lang.ClassFormatError: Illegal UTF8 string in constant pool in class file User
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:787)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:447)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
	at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:482)
$

Doh! What happened?

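It is rather simple. When you compiled User.java using javac, the compiler counted the bytes in that string and prepended the count, so the JVM knows the string’s length. The count sits in the two bytes right before the text (00 16), just after a 01 byte that tags the entry as a string. For plain ASCII text like ours, one character is one byte, so we can simply count characters.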

Let’s count the characters.

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
 H  a  c  k  i  n  g \s  J  a  v  a \s  B  y  t  e  c  o  d  e  !

This shows us that our original string has twenty-two characters, and if we convert the number 22 to hexadecimal we get 0x16. If you look at the byte just before the start of our string in Bless, you will see that exact value.

[Screenshot: the 0x16 length byte in front of the string in Bless]
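
To make that length prefix concrete, here is a minimal Python sketch of how a reader might walk one of these string entries: a tag byte of 0x01, a two-byte big-endian length, then that many bytes of text. The function name read_utf8_entry is hypothetical, and the sketch treats the text as plain ASCII, which is a simplification of the modified UTF-8 encoding that class files really use.

```python
import struct

def read_utf8_entry(data, offset=0):
    """Read one string constant: tag 0x01, 2-byte length, then the text bytes."""
    tag, length = struct.unpack_from(">BH", data, offset)
    if tag != 0x01:
        raise ValueError(f"not a string constant (tag {tag:#04x})")
    payload = data[offset + 3:offset + 3 + length]
    # Class file strings may not contain a zero byte, so a length that
    # reaches past the text into the neighbouring bytes is rejected here.
    if len(payload) < length or 0 in payload:
        raise ValueError("illegal string in constant pool")
    return payload.decode("ascii")

# The entry as javac wrote it: length 0x16 (22) matches the text.
good = bytes.fromhex("010016") + b"Hacking Java Bytecode!"
print(read_utf8_entry(good))

# The same entry after pasting in 14 new bytes but keeping the old length.
# The 8 bytes that follow the string are copied from the dump.
stale = bytes.fromhex("010016") + b"l33t hax0r bro" + bytes.fromhex("0700210c00220023")
try:
    read_utf8_entry(stale)
    outcome = "loaded"
except ValueError as error:
    outcome = str(error)
print(outcome)
```

With the stale length, the reader swallows eight bytes that belong to the next entries, including zero bytes, which is a plausible explanation for the Illegal UTF8 string complaint from the JVM.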

The problem is that we have now changed that string, which means we also need to change the character count to match the new string.

 1  2  3  4  5  6  7  8  9 10 11 12 13 14
 l  3  3  t \s  h  a  x  0  r \s  b  r  o

You can see that our new string has fourteen characters, and if we convert 14 to hexadecimal we get 0x0E. Now you just need to replace that 0x16 byte with 0x0E and save the file.

[Screenshot: the length byte changed to 0x0E in Bless]

Run the command again!

$ java User
l33t hax0r bro
$

Success!
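
The same edit can also be scripted instead of done by hand in a hex editor. The sketch below is one possible way to do it in Python, not a tool from this series: the function name patch_utf8_constant is made up, it only handles ASCII strings, and it assumes the tag, length, and text pattern occurs exactly once in the file. It is demonstrated on the three dump lines we looked at earlier rather than on the whole file.

```python
def patch_utf8_constant(class_bytes, old, new):
    """Swap one string constant for another and rewrite its 2-byte length."""
    old_raw = old.encode("ascii")
    new_raw = new.encode("ascii")
    # Each entry is: tag 0x01, big-endian 2-byte length, then the text itself.
    old_entry = b"\x01" + len(old_raw).to_bytes(2, "big") + old_raw
    new_entry = b"\x01" + len(new_raw).to_bytes(2, "big") + new_raw
    if class_bytes.count(old_entry) != 1:
        raise ValueError("expected exactly one matching string constant")
    return class_bytes.replace(old_entry, new_entry)

# Offsets 0xc0 through 0xef of User.class, copied from the xxd dump.
fragment = bytes.fromhex(
    "0c00 0800 0907 001e 0c00 1f00 2001 0016"
    "4861 636b 696e 6720 4a61 7661 2042 7974"
    "6563 6f64 6521 0700 210c 0022 0023 0100"
)

patched = patch_utf8_constant(fragment, "Hacking Java Bytecode!", "l33t hax0r bro")
print(patched.hex(" "))
print(len(fragment) - len(patched), "bytes shorter")
```

To patch the real thing, you would read User.class in binary mode, pass its bytes through the function, and write the result back out. Keep a backup copy of the original class file first.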

Conclusion

You should now have a decent understanding of Java Bytecode, a high-level comprehension of Java Opcodes, and the ability to manipulate basic strings inside a compiled class file.

In our next installment, we are going to talk about more advanced techniques for bypassing certain code blocks by manipulating the Bytecode directly. We will also start discussing tools that will help us in our reverse engineering efforts.