Transforms for the Balbuzard tools:

This page describes the (de)obfuscation transforms included in the Balbuzard tools, and explains how to add your own transforms using simple Python scripts.

Transforms are used by bbcrack and bbharvest to detect when a malware sample uses an algorithm such as XOR to hide data or a payload embedded into a malicious document. The goal is to find which obfuscation algorithm and which keys or parameters have been used.

The Balbuzard tools include a number of obfuscation transforms found in existing malware over the past few years: XOR, ADD, ROL, XOR+ROL, XOR+ADD, ADD+XOR, XOR with incrementing key, XOR chained, etc.

Each transform may have one or several parameters. For example, the XOR transform has one parameter, an 8-bit key: each byte (B) in the data is transformed to "B XOR key". When bruteforcing, keys 1 to 255 are tried, because key 0 would leave the data unchanged.
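
As a minimal sketch of this per-byte rule (not the bbcrack implementation; the function name `xor_bytes` is hypothetical), XOR with a single-byte key could be written like this in Python 3:

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """Return data with every byte B replaced by B XOR key (key in 0..255)."""
    if not 0 <= key <= 0xFF:
        raise ValueError("key must fit in 8 bits")
    return bytes(b ^ key for b in data)


# Hypothetical usage: obfuscate two bytes, then undo it with the same key.
obfuscated = xor_bytes(b"MZ", 0x4F)
restored = xor_bytes(obfuscated, 0x4F)
```

Applying the same key a second time restores the original bytes, so the same transform serves for both obfuscation and deobfuscation.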

Transforms are organized in three levels (1, 2, 3): level 1 contains the simplest/fastest transforms (such as XOR), level 2 contains more complex transforms (such as XOR+ADD), and level 3 contains less frequent or slower transforms. See below or run "bbcrack.py -t list" to check the full list.

Level 1:

  • identity: Identity Transformation, no change to data. Parameters: none.
  • xor: XOR with 8 bits static key A. Parameters: A (1-FF).
  • add: ADD with 8 bits static key A. Parameters: A (1-FF).
  • rol: ROL - rotate A bits left. Parameters: A (1-7).
  • xor_rol: XOR with static 8 bits key A, then rotate B bits left. Parameters: A (1-FF), B (1-7).
  • add_rol: ADD with static 8 bits key A, then rotate B bits left. Parameters: A (1-FF), B (1-7).
  • rol_add: rotate A bits left, then ADD with static 8 bits key B. Parameters: A (1-7), B (1-FF).
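
The ADD and ROL operations act on 8-bit values, so results have to wrap around at 256. Here is a minimal sketch, assuming the usual definitions of modular addition and bit rotation. The helper names `add8`, `rol8`, `add_rol` and `rol_add` are hypothetical functions written for this illustration; bbcrack.py provides its own `rol()`.

```python
def add8(value: int, key: int) -> int:
    """Add key to an 8-bit value, wrapping around at 256."""
    return (value + key) & 0xFF


def rol8(value: int, bits: int) -> int:
    """Rotate an 8-bit value left by the given number of bits."""
    bits %= 8
    return ((value << bits) | (value >> (8 - bits))) & 0xFF


def add_rol(data: bytes, key: int, bits: int) -> bytes:
    """ADD with key, then rotate left, applied to each byte."""
    return bytes(rol8(add8(b, key), bits) for b in data)


def rol_add(data: bytes, bits: int, key: int) -> bytes:
    """Rotate left, then ADD with key, applied to each byte."""
    return bytes(add8(rol8(b, bits), key) for b in data)
```

The order of the two operations changes the result, which is why add_rol and rol_add are listed as separate transforms.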

Level 2:

  • xor_add: XOR with 8 bits static key A, then ADD with 8 bits static key B. Parameters: A (1-FF), B (1-FF).
  • add_xor: ADD with 8 bits static key A, then XOR with 8 bits static key B. Parameters: A (1-FF), B (1-FF).
  • xor_inc: XOR with 8 bits key A incrementing after each character. Parameters: A (0-FF).
  • xor_dec: XOR with 8 bits key A decrementing after each character. Parameters: A (0-FF).
  • sub_inc: SUB with 8 bits key A incrementing after each character. Parameters: A (0-FF).
  • xor_chained: XOR with 8 bits key A chained with previous character. Parameters: A (1-FF).
  • xor_rchained: XOR with 8 bits key A chained with next character (Reverse order from end to start). Parameters: A (1-FF).
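
By analogy with the xor_inc sample class shown later on this page, the decrementing and subtracting variants can be sketched as follows. This is an assumption about their behaviour based only on the descriptions above, not a copy of the bbcrack code, and the standalone function names are hypothetical:

```python
def xor_dec(data: bytes, key: int) -> bytes:
    """XOR each byte with a key that decrements (modulo 256) after each byte."""
    return bytes(b ^ ((key - i) & 0xFF) for i, b in enumerate(data))


def sub_inc(data: bytes, key: int) -> bytes:
    """Subtract from each byte a key that increments (modulo 256) after each byte."""
    return bytes((b - ((key + i) & 0xFF)) & 0xFF for i, b in enumerate(data))
```

Because the key depends on the position of each byte, key 0 is not an identity here, which matches the 0-FF parameter range listed for these transforms.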

Level 3:

  • xor_inc_rol: XOR with 8 bits key A incrementing after each character, then rotate B bits left. Parameters: A (0-FF), B (1-7).
  • xor_rchained_all: XOR Transform, chained from the right with all following characters. Only works well with bbharvest.

How to extend the list of patterns and transforms

It is possible to extend Balbuzard with your own (de)obfuscation transforms, using simple Python scripts.

All transforms and plugins are shared by bbcrack, bbharvest and bbtrans.

If you develop useful plugin scripts and you would like me to reference them, or if you think of additional transforms that bbcrack should include, please contact me.

Transform plugin scripts must be stored in the plugins subfolder, with a name starting with "trans_". Read the contents of the provided script "trans_sample_plugin.py" for sample transforms that you can reuse.

First define a new Transform class, inheriting either from Transform_char or
Transform_string:

  • Transform_char: for transforms that apply to each character/byte independently, not depending on the location of the character. (example: simple XOR)
  • Transform_string: for all other transforms, that may apply to several characters at once, or taking into account the location of the character. (example: XOR with increasing key)

Transform_char is usually much faster because it uses a translation table: the obfuscation algorithm runs only 256 times (once per possible byte value), whatever the size of the data.
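
The translation-table idea can be sketched as follows. This is an illustration in Python 3, not the Transform_char code itself; `build_table` and the example XOR key are hypothetical choices made for this sketch:

```python
def build_table(transform_int) -> bytes:
    """Call transform_int once per possible byte value and store the results."""
    return bytes(transform_int(i) for i in range(256))


# Example per-byte algorithm: XOR with key 0x4F.
table = build_table(lambda i: i ^ 0x4F)

# bytes.translate() maps every byte through the table, so the Python-level
# algorithm is not called again, however large the data is.
data = b"hello" * 1000
obfuscated = data.translate(table)
```

A transform whose result depends on the position of the byte cannot be expressed as a single 256-entry table, which is why such transforms need Transform_string instead.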

A Transform class represents a generic transform (obfuscation algorithm), such as XOR
or XOR+ROL.
When the class is instantiated as an object, it includes the keys of the
obfuscation algorithm, specified as parameters. (e.g. "XOR 4F" or "XOR 4F +
ROL 3")

For each transform class, you need to implement the following methods/variables:

  • a description and a short name for the transform
  • __init__() to store parameters
  • iter_params() to generate all the possible parameters for bruteforcing (e.g. 1 to 255 for XOR)
  • For a Transform_char: transform_int() or transform_char() to apply the transform to a single character, either as an integer (ASCII code) or as a string of length 1. In most cases transform_int is the easiest solution.
  • For a Transform_string: transform_string() to apply the transform to the whole string at once.

Then do not forget to add the transform to the proper level (1, 2 or 3); see below, after the
class samples.
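
To illustrate why iter_params() matters, here is a simplified sketch of bruteforcing a single-byte XOR key in Python 3. It is only an assumption-laden illustration: the search for a known substring stands in for whatever detection bbcrack really performs, and `brute_force_xor` and the sample data are hypothetical.

```python
def iter_params():
    """Return every candidate XOR key (0 is skipped: it would be identity)."""
    return range(1, 256)


def brute_force_xor(data: bytes, needle: bytes) -> list:
    """Return the keys for which the deobfuscated data contains needle."""
    hits = []
    for key in iter_params():
        candidate = bytes(b ^ key for b in data)
        if needle in candidate:
            hits.append(key)
    return hits


# Hypothetical obfuscated sample: a URL XORed with key 0x4F.
secret = bytes(b ^ 0x4F for b in b"http://example.com/payload")
found_keys = brute_force_xor(secret, b"http://")
print([hex(k) for k in found_keys])
```

Each value produced by iter_params() is one attempt, so the size of the parameter space directly determines how long bruteforcing takes.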

Here are three different examples:

1) Transform_char with single parameter (e.g. XOR)

    class Transform_SAMPLE_XOR (Transform_char):
        # Provide a description for the transform, and an id (short name for
        # command line options):
        gen_name = 'SAMPLE XOR with 8 bits static key A. Parameters: A (1-FF).'
        gen_id   = 'samplexor'

        def __init__(self, params):
            # the __init__ method must store provided parameters and build the specific
            # name and shortname of the transform with parameters
            self.params = params
            self.name = "Sample XOR %02X" % params
            # this shortname will be used to save bbcrack and bbtrans results to files
            self.shortname = "samplexor%02X" % params

        def transform_int (self, i):
            # here params is an integer (the XOR key), and i the ASCII code of the character
            return i ^ self.params

        @staticmethod
        def iter_params ():
            # the XOR key can be 1 to 255 (0 would be identity)
            for key in xrange(1,256):
                yield key

2) Transform_char with multiple parameters (e.g. XOR+ROL)

    class Transform_SAMPLE_XOR_ROL (Transform_char):
        # generic name for the class:
        gen_name = 'XOR with static 8 bits key A, then rotate B bits left. Parameters: A (1-FF), B (1-7).'
        gen_id   = 'xor_rol'

        def __init__(self, params):
            # Here we assume that params is a tuple with two integers:
            self.params = params
            self.name = "XOR %02X then ROL %d" % params
            self.shortname = "xor%02X_rol%d" % params

        def transform_int (self, i):
            # here params is a tuple
            xor_key, rol_bits = self.params
            # rol() is defined in bbcrack.py
            return rol(i ^ xor_key, rol_bits)

        @staticmethod
        def iter_params ():
            "return (XOR key, ROL bits)"
            # the XOR key can be 1 to 255 (0 would be like ROL)
            for xor_key in xrange(1,256):
                # the ROL bits can be 1 to 7:
                for rol_bits in xrange(1,8):
                    # yield a tuple with XOR key and ROL bits:
                    yield (xor_key, rol_bits)

3) Transform_string

    class Transform_SAMPLE_XOR_INC (Transform_string):
        """
        Sample XOR Transform, with incrementing key
        (this kind of transform must be implemented as a Transform_string, because
        it gives different results depending on the location of the character)
        """
        # generic name for the class:
        gen_name = 'XOR with 8 bits key A incrementing after each character. Parameters: A (0-FF).'
        gen_id   = 'xor_inc'

        def __init__(self, params):
            self.params = params
            self.name = "XOR %02X INC" % params
            self.shortname = "xor%02X_inc" % params

        def transform_string (self, data):
            """
            Method to be overloaded, only for a transform that acts on a string
            globally.
            This method should apply the transform to the data string, using params
            as parameters, and return the transformed data as a string.
            (the resulting string does not need to have the same length as data)
            """
            # here params is an integer
            out = []
            for i in xrange(len(data)):
                xor_key = (self.params + i) & 0xFF
                out.append(chr(ord(data[i]) ^ xor_key))
            # join once at the end instead of growing a string in the loop
            return ''.join(out)

        @staticmethod
        def iter_params ():
            # the XOR key can be 0 to 255 (0 is not identity here)
            for xor_key in xrange(0,256):
                yield xor_key

Adding transforms to level 1, 2 or 3

Then, do not forget to add new transforms to the proper level, otherwise the tools will not use them:

  • level 1 for fast transforms with up to 2000 iterations (e.g. xor, xor+rol)
  • level 2 for slower transforms or more iterations (e.g. xor+add)
  • level 3 for slow or infrequent transforms

For this, call the function add_transform with the chosen level:

    add_transform(Transform_SAMPLE_XOR, level=1)
    add_transform(Transform_SAMPLE_XOR_ROL, level=1)
    add_transform(Transform_SAMPLE_XOR_INC, level=2)
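
To judge which level fits a new transform, it can help to count how many parameter combinations its iter_params() would yield. Here is a rough sketch of that arithmetic for the parameter ranges listed on this page; the dictionary is illustrative only and not part of Balbuzard:

```python
# Number of values per parameter, from the ranges documented on this page.
KEY_1_FF = 255   # 8-bit key, 1 to FF (0 excluded)
ROL_1_7 = 7      # rotation count, 1 to 7

iterations = {
    "xor": KEY_1_FF,                  # one key
    "xor_rol": KEY_1_FF * ROL_1_7,    # key combined with rotation count
    "xor_add": KEY_1_FF * KEY_1_FF,   # two independent keys
}

for name, count in iterations.items():
    fits_level_1 = count <= 2000
    print(f"{name}: {count} iterations, fits level 1: {fits_level_1}")
```

With these ranges, xor_rol stays under the 2000-iteration guideline (255 x 7 = 1785), while xor_add needs 255 x 255 = 65025 iterations, which is consistent with its place in level 2.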

See the provided script "trans_sample_plugin.py" for sample transforms that you can reuse.

See the source code in bbcrack.py and the Transform classes for more options and examples.

