HTTPS SSH
libbinrec: a recompiling translator for machine code
====================================================
Copyright (c) 2016 Andrew Church <achurch@achurch.org>
See the file "COPYING" for conditions on use and redistribution.

Version: 0.1


Overview
--------
libbinrec is a library for translating machine code from one CPU
architecture into equivalent and efficient machine code for a
different architecture, such as when emulating one CPU on another.
libbinrec is designed primarily for use in just-in-time (JIT)
compilation, which trades faster execution on average (translated code
can approach a 1:1 source-to-native instruction ratio, as opposed to
the dozens of native instructions which may be required to interpret
each source instruction on the fly) for additional overhead when
executing a block of code for the first time.  To this end, the library
is written so that in the default state (with no optimizations enabled),
it will produce code that runs both accurately and reasonably quickly
while keeping compilation time to a minimum.  The library also provides
several optimization flags which allow trading longer compilation time
for faster execution or disabling emulation of rarely-used features of
the guest architecture (such as floating-point status bits) to
streamline the generated code.

While libbinrec is designed for use in a JIT environment, it can be used
in any situation requiring translation of machine code between supported
architectures.  The machine code produced by the library is independent
of both libbinrec itself and the environment in which it is run: a
client program encountering a new guest code sequence could translate it
once with no optimization to quickly obtain runnable code, then signal a
background process to translate the same code at a higher optimization
level and switch to that code when it is ready.  (This is true even if
the background process is running on a separate computer with a
different architecture; it is entirely possible, for example, to
translate from PowerPC to x86-64 while running on a 32-bit ARM CPU.)
By implication, the generated code is also fully relocatable, so it can
be safely stored in a persistent cache and reused for later invocations
of the same program.

Currently, libbinrec only supports the PowerPC 32-bit instruction set
implemented in PowerPC 6xx/7xx processors as input and 64-bit x86
machine code as output.


Requirements
------------
libbinrec is written in pure C99 (with some additional compiler- and
architecture-specific code used to improve performance) and should
compile with any standards-compliant compiler.  The included Makefile is
written for GNU Make and supports building on Linux and other Unix-like
systems, Mac OS X, or Windows (MinGW), using Clang, GCC, or the Intel C
compiler.  libbinrec also includes a CMake (see <https://cmake.org/>)
control file for ease of integration into CMake-based projects.

libbinrec makes the following assumptions (beyond what is required by
C99) about the execution environment:
   - The environment uses two's-complement representations for signed
     integer values.
   - The exact-size integer types int8_t, int16_t, int32_t, and int64_t
     exist.
   - The "int" and "intptr_t" types are at least 32 bits wide.
   - The "float" and "double" types use IEEE 754 floating-point format.
There are currently no plans to support environments for which these do
not all hold.

The included tests assume for convenience that a null pointer is
represented by an all-zero bit pattern.


Building
--------
libbinrec can be built by simply running the "make" command (or "gmake",
if GNU Make is installed under that name on your system) in the top
directory of the libbinrec distribution.  This will create shared and
static library files in the top directory, which can then be installed
on the system with "make install".

Several configuration variables are available to control the build
process or specify nonstandard paths for dependent libraries.  These can
be set on the "make" command line; for example, "make ENABLE_ASSERT=1".
See the "Configuration" section at the top of the "Makefile" file for
details.


Using libbinrec
---------------
See the documentation in include/binrec.h for details.

When using libbinrec from a C++ program, the C++ header (binrec++.h) may
be more convenient; this header provides a "binrec" namespace for all
constants and functions, and wraps handle-specific functions in a C++
class for convenience.


Performance
-----------
The following figures were obtained by running the included benchmark
tool (which can be built with "make benchmarks/bench") on an Intel Core
i7-4770S running at a clock speed of 3.1GHz, using the System V ABI.  In
all cases, translated code was cached using a flat array of pointers to
translated units (one entry per byte in the guest address space).  The
benchmark program and guest binaries were built using GCC 5.4 and GNU
binutils 2.26.1, using the default command-line options set in the
Makefile and etc/* scripts.

    Architecture  |dhry_noopt | dhry_opt  |whet_noopt | whet_opt  |
    /optimization |(MDhry/sec)|(MDhry/sec)|  (MWIPS)  |  (MWIPS)  |
   ---------------+-----------+-----------+-----------+-----------+
    native        |   11.9    |   30.8    |   4940    |    7270   |
    ppc32 -O0     |    2.04   |    2.89   |    109    |     185   |
    ppc32 -O1     |    2.56   |    3.53   |    115    |     195   |
    ppc32 -O2     |    3.11   |    4.05   |    123    |     196   |
    + -fchain     |    3.17   |    4.14   |    126    |     200   |
    + -ffast-math |   -----   |   -----   |   1780    |    3250   |

The order-of-magnitude difference from enabling -ffast-math for the
PowerPC Whetstone benchmark results from the heavy cost of updating
FPSCR (the Floating-Point Status and Control Register) after each
floating-point instruction.  The rules for setting bits in this register
are fairly complex, and the code required to implement the register
correctly is difficult for libbinrec to optimize, resulting in a
significant performance penalty.  Disabling updates to this register
brings the translated code much closer to native performance, at the
cost of no support for unmasked floating-point exceptions and certain
minor differences in behavior (such as different NaN payloads in
response to an invalid-operation exception) from an actual PowerPC CPU.


Limitations
-----------
libbinrec assumes that the input program it is given is valid machine
code for the selected input architecture (referred to as the "guest"
architecture in library documentation).  The library will correctly
detect and translate illegal instructions explicitly reserved as such
by the architecture, such as the all-zero PowerPC instruction word, but
other byte sequences which do not repesent a valid instruction for the
guest architecture may result in behavior different from a physical
implementation of that architecture.  As a consequence, libbinrec may
not correctly translate programs which include deliberately invalid
instruction encodings, such as to take advantage of undocumented
behavior of a specific architecture implementation.

More generally, libbinrec is designed to translate between architectures
rather than specific implementations; thus, as a rule, input code which
triggers behavior undefined in the architecture specification may behave
differently when translated by libbinrec than when run on a physical CPU
implementing the guest architecture.  For example, the PowerPC integer
divide instructions specify that the contents of the destination
register are undefined after divison by zero; libbinrec deliberately
skips updating the register state in this case, so the value eventually
stored back to the processor state block will depend on the structure of
the translated code.  This likely differs from the behavior of an actual
PowerPC CPU, but libbinrec does not attempt to match that behavior.
(There are a few exceptions to this rule, such as frC rounding for
PowerPC floating-point multiply operations; see the guest-specific
documentation for details.)

libbinrec is intended as a pure program translator rather than a
full-fledged architecture emulator, and consequently it does not support
translation of privileged instructions in the input program.
Non-privileged instructions which explicitly transfer control to a
privileged handler, such as the PowerPC "sc" (system call) instruction,
are translated by inserting a call to a handler function supplied by the
library client.  Similarly, libbinrec has no facility for handling
memory-mapped I/O; if such support is needed, the client program should
mark relevant regions of the guest memory space as unmapped on the host
system and use unmapped-access host exceptions to detect and handle such
accesses.  libbinrec also does not attempt to count instruction cycles
or otherwise reproduce precise timing behavior of the guest processor.

libbinrec does not, in general, provide support for any high-level
exceptions (C++ exceptions, for example), so client programs should take
care that functions which can be called from generated code do not raise
such exceptions, or provide appropriate wrappers for the platform and
language which insulate such exceptions from generated code.  As a
convenience, libbinrec includes a mode for x86-64 Windows output
(BINREC_ARCH_X86_64_WINDOWS_SEH) which follows the Windows-specific
structured exception handling ABI; however, this ABI disallows certain
optimizations which libbinrec normally makes use of, so code generated
in this mode will generally perform worse than code generated in a
non-SEH mode.

See the documentation in include/binrec.h for limitations and other
details of behavior with respect to individual architectures.


Reporting bugs
--------------
Please send any bug reports or suggestions directly to the author.