Loading, editing and creating JVM ClassFiles with Jawa

Jawa is a rainy-day project to support inspecting, modifying, and creating JVM bytecode from Python. It's a successor to an earlier project from 2010 which was used to magically parse new versions of Minecraft and find new network packets, entities, sounds, etc. It did this by looking for patterns in the bytecode and reconstructing higher-level objects based on what it found.

These days, there are popular new tools like Krakatau for producing human-readable output, but that kind of tool isn't always the best option when you actually need to act on the results.

Creation

Jawa can construct brand new ClassFiles from scratch - let's try the classic "Hello World!" example:

#!/usr/bin/env python
# -*- coding: utf8 -*-
"""
An example showing how to create a "Hello World" class from scratch.
"""
from jawa import ClassFile
from jawa.assemble import assemble

if __name__ == '__main__':
    cf = ClassFile.create('HelloWorld')

    main = cf.methods.create('main', '([Ljava/lang/String;)V', code=True)
    main.access_flags.acc_static = True
    main.code.max_locals = 1
    main.code.max_stack = 2

    main.code.assemble(assemble([
        ('getstatic', cf.constants.create_field_ref(
            'java/lang/System',
            'out',
            'Ljava/io/PrintStream;'
        )),
        ('ldc', cf.constants.create_string('Hello World!')),
        ('invokevirtual', cf.constants.create_method_ref(
            'java/io/PrintStream',
            'println',
            '(Ljava/lang/String;)V'
        )),
        ('return',)
    ]))

    with open('HelloWorld.class', 'wb') as fout:
        cf.save(fout)
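
The max_locals and max_stack values in the listing above are set by hand. As a rough illustration of where they come from: main is static and takes a single argument, which occupies local slot 0 (hence max_locals = 1), and the operand stack never holds more than two values at once. The sketch below is plain Python 3 with no Jawa calls. The stack effects are written out by hand for this one method, and max_stack_depth is a hypothetical helper, not part of Jawa.

```python
# Net stack effect (pushes minus pops) of each instruction in the
# HelloWorld listing. These values are specific to this example: the
# effect of invokevirtual depends on the descriptor of the called method.
STACK_EFFECTS = [
    ("getstatic", +1),      # pushes the System.out reference
    ("ldc", +1),            # pushes the "Hello World!" string reference
    ("invokevirtual", -2),  # pops the receiver and one argument, returns void
    ("return", 0),          # leaves the stack untouched
]


def max_stack_depth(effects):
    """Return the deepest operand stack reached by straight-line code."""
    depth = 0
    deepest = 0
    for mnemonic, delta in effects:
        depth += delta
        if depth < 0:
            raise ValueError("stack underflow at %s" % mnemonic)
        deepest = max(deepest, depth)
    return deepest


print(max_stack_depth(STACK_EFFECTS))  # prints 2
```

This only handles code without branches; anything with jumps would need a proper control-flow walk.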

Now let's give it a try:

» java HelloWorld
Hello World!

Success! Just like that, we've assembled a class that the JVM will happily run. You can compare this to the Jasmin "Hello World!" example, the de facto standard for JVM assembly syntax. Both examples are equally compact and concise. We accomplish this by using the assemble() helper, which provides support for pseudo-assembly (including named labels and branches) and generates a stream of Instruction and Operand objects. This is what it would look like without that helper:

from jawa.bytecode import Instruction, Operand, OperandTypes

main.code.assemble([
    Instruction.from_mnemonic('getstatic', [
        Operand(
            OperandTypes.CONSTANT_INDEX,
            cf.constants.create_field_ref(
                'java/lang/System',
                'out',
                'Ljava/io/PrintStream;'
            ).index
        )
    ]),
    ...
])

This is extremely precise and will always produce bytecode exactly as provided (even when it's wrong), but you would quickly go insane doing this by hand, so it's recommended to always use the assemble() helper.
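
Because the bytecode comes out exactly as provided, a typo in a type descriptor (for example, an object type missing its trailing semicolon) would end up in the class file unnoticed. Here is a minimal sketch of a descriptor check in plain Python 3, following the descriptor grammar from the JVM specification. The function names are hypothetical and nothing here is part of Jawa; it is a sanity check, not a full validator (it does not check class name characters, for instance).

```python
BASE_TYPES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean",
}


def _read_type(desc, pos):
    """Parse one field type starting at desc[pos]; return (name, next_pos)."""
    dims = 0
    while pos < len(desc) and desc[pos] == "[":
        dims += 1
        pos += 1
    if pos >= len(desc):
        raise ValueError("descriptor ends early: %r" % desc)
    tag = desc[pos]
    if tag in BASE_TYPES:
        name, pos = BASE_TYPES[tag], pos + 1
    elif tag == "L":
        end = desc.find(";", pos)
        if end == -1:
            raise ValueError("missing ';' after class name: %r" % desc)
        name, pos = desc[pos + 1:end].replace("/", "."), end + 1
    else:
        raise ValueError("unknown type tag %r in %r" % (tag, desc))
    return name + "[]" * dims, pos


def parse_field_descriptor(desc):
    """Return a readable type name, or raise ValueError if malformed."""
    name, pos = _read_type(desc, 0)
    if pos != len(desc):
        raise ValueError("trailing characters in %r" % desc)
    return name


def parse_method_descriptor(desc):
    """Return (parameter type names, return type name) for a method."""
    if not desc.startswith("("):
        raise ValueError("method descriptor must start with '(': %r" % desc)
    pos, params = 1, []
    while pos < len(desc) and desc[pos] != ")":
        name, pos = _read_type(desc, pos)
        params.append(name)
    if pos >= len(desc):
        raise ValueError("missing ')' in %r" % desc)
    ret = desc[pos + 1:]
    return params, "void" if ret == "V" else parse_field_descriptor(ret)


print(parse_method_descriptor("([Ljava/lang/String;)V"))
print(parse_field_descriptor("Ljava/io/PrintStream;"))
```

Running the descriptors through a check like this before handing them to create_field_ref() or create_method_ref() would catch the most common slip, a forgotten ';', as a ValueError.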

Modification

We can also easily modify existing classes. Let's take our Hello World! example from the last section and turn it into a Hello Mars! example.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys

from jawa import ClassFile
from jawa.assemble import assemble


def main():
    with open(sys.argv[1], 'rb') as fin:
        cf = ClassFile(fin)

        # We aren't doing HelloWorld any more, so let's fix the name of
        # our class!
        cf.this = cf.constants.create_class('HelloMars')

        # We could just modify the "hello world!" string in the constant
        # pool, but where is the fun in that? Instead, we're going to
        # disassemble the main method, find the 'ldc' that loads the string
        # constant to the stack, and change it to point to a new constant.
        main = cf.methods.find_one(name='main')

        new_main = []
        for instruction in main.code.disassemble():
            if instruction.mnemonic == 'ldc':
                # We could build an Instruction and Operand object ourselves,
                # or use the `assemble()` utility to do it for us.
                new_main.extend(
                    assemble((
                        ('ldc', cf.constants.create_string('Hello Mars!')),
                    ))
                )
            else:
                # We only wanted to patch the 'ldc', everything else we want
                # to keep.
                new_main.append(instruction)

        main.code.assemble(new_main)

        with open('HelloMars.class', 'wb') as fout:
            cf.save(fout)


if __name__ == '__main__':
    sys.exit(main())

Let's give our newly modified class a try:

» java HelloMars
Hello Mars!

Success!
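
To look at the result without involving the JVM (or Jawa), the constant pool can be read straight from the file. The following is a rough standard-library sketch in Python 3, based on the ClassFile layout in the JVM specification: it checks the magic number, reads the version, and collects every Utf8 entry, which for HelloMars.class should include the new 'Hello Mars!' string. The name list_utf8_constants is hypothetical. Class files use a modified UTF-8 encoding, so decoding as plain UTF-8 is an approximation that works for ordinary text.

```python
import struct
import sys

# Payload size in bytes for each fixed-width constant pool tag.
# Tag 1 (Utf8) is variable-length and handled separately below.
FIXED_SIZES = {
    3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4, 11: 4, 12: 4,
    15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2,
}


def list_utf8_constants(data):
    """Return (major_version, [utf8 strings]) for raw .class file bytes."""
    magic, _minor, major, count = struct.unpack_from(">IHHH", data, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a class file")
    # The pool starts right after the 10-byte header; indexes start at 1.
    pos, index, strings = 10, 1, []
    while index < count:
        tag = data[pos]
        pos += 1
        if tag == 1:
            (length,) = struct.unpack_from(">H", data, pos)
            raw = data[pos + 2:pos + 2 + length]
            strings.append(raw.decode("utf-8", errors="replace"))
            pos += 2 + length
        elif tag in FIXED_SIZES:
            pos += FIXED_SIZES[tag]
        else:
            raise ValueError("unknown constant pool tag %d" % tag)
        # Long and Double entries occupy two constant pool slots.
        index += 2 if tag in (5, 6) else 1
    return major, strings


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fin:
        major, strings = list_utf8_constants(fin.read())
    print("major version:", major)
    for value in strings:
        print(repr(value))
```

Saved under a name of your choosing and run with HelloMars.class as its only argument, it prints the class file's major version followed by each Utf8 constant.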

More!

Jawa has extensive documentation - give it a try.