Issue #6 commented on in achurch_/libbinrec
libbinrec: a recompiling translator for machine code ==================================================== Copyright (c) 2016 Andrew Church <email@example.com> See the file "COPYING" for conditions on use and redistribution. Version: 0.1 Overview -------- libbinrec is a library for translating machine code from one CPU architecture into equivalent and efficient machine code for a different architecture, such as when emulating one CPU on another. libbinrec is designed primarily for use in just-in-time (JIT) compilation, which trades faster execution on average (translated code can approach a 1:1 source-to-native instruction ratio, as opposed to the dozens of native instructions which may be required to interpret each source instruction on the fly) for additional overhead when executing a block of code for the first time. To this end, the library is written so that in the default state (with no optimizations enabled), it will produce code that runs both accurately and reasonably quickly while keeping compilation time to a minimum. The library also provides several optimization flags which allow trading longer compilation time for faster execution or disabling emulation of rarely-used features of the guest architecture (such as floating-point status bits) to streamline the generated code. While libbinrec is designed for use in a JIT environment, it can be used in any situation requiring translation of machine code between supported architectures. The machine code produced by the library is independent of both libbinrec itself and the environment in which it is run: a client program encountering a new guest code sequence could translate it once with no optimization to quickly obtain runnable code, then signal a background process to translate the same code at a higher optimization level and switch to that code when it is ready. (This is true even if the background process is running on a separate computer with a different architecture; it is entirely possible, for example, to translate from PowerPC to x86-64 while running on a 32-bit ARM CPU.) By implication, the generated code is also fully relocatable, so it can be safely stored in a persistent cache and reused for later invocations of the same program. Currently, libbinrec only supports the PowerPC 32-bit instruction set implemented in PowerPC 6xx/7xx processors as input and 64-bit x86 machine code as output. Requirements ------------ libbinrec is written in pure C99 (with some additional compiler- and architecture-specific code used to improve performance) and should compile with any standards-compliant compiler. The included Makefile is written for GNU Make and supports building on Linux and other Unix-like systems, Mac OS X, or Windows (MinGW), using Clang, GCC, or the Intel C compiler. libbinrec also includes a CMake (see <https://cmake.org/>) control file for ease of integration into CMake-based projects. libbinrec makes the following assumptions (beyond what is required by C99) about the execution environment: - The environment uses two's-complement representations for signed integer values. - The exact-size integer types int8_t, int16_t, int32_t, and int64_t exist. - The "int" and "intptr_t" types are at least 32 bits wide. - The "float" and "double" types use IEEE 754 floating-point format. There are currently no plans to support environments for which these do not all hold. The included tests assume for convenience that a null pointer is represented by an all-zero bit pattern. Building -------- libbinrec can be built by simply running the "make" command (or "gmake", if GNU Make is installed under that name on your system) in the top directory of the libbinrec distribution. This will create shared and static library files in the top directory, which can then be installed on the system with "make install". Several configuration variables are available to control the build process or specify nonstandard paths for dependent libraries. These can be set on the "make" command line; for example, "make ENABLE_ASSERT=1". See the "Configuration" section at the top of the "Makefile" file for details. Using libbinrec --------------- See the documentation in include/binrec.h for details. When using libbinrec from a C++ program, the C++ header (binrec++.h) may be more convenient; this header provides a "binrec" namespace for all constants and functions, and wraps handle-specific functions in a C++ class for convenience. Performance ----------- The following figures were obtained by running the included benchmark tool (which can be built with "make benchmarks/bench") on an Intel Core i7-4770S running at a clock speed of 3.1GHz, using the System V ABI. In all cases, translated code was cached using a flat array of pointers to translated units (one entry per byte in the guest address space). The benchmark program and guest binaries were built using GCC 5.4 and GNU binutils 2.26.1, using the default command-line options set in the Makefile and etc/* scripts. Architecture |dhry_noopt | dhry_opt |whet_noopt | whet_opt | /optimization |(MDhry/sec)|(MDhry/sec)| (MWIPS) | (MWIPS) | ---------------+-----------+-----------+-----------+-----------+ native | 11.9 | 30.8 | 4940 | 7270 | ppc32 -O0 | 2.04 | 2.89 | 109 | 185 | ppc32 -O1 | 2.56 | 3.53 | 115 | 195 | ppc32 -O2 | 3.11 | 4.05 | 123 | 196 | + -fchain | 3.17 | 4.14 | 126 | 200 | + -ffast-math | ----- | ----- | 1780 | 3250 | The order-of-magnitude difference from enabling -ffast-math for the PowerPC Whetstone benchmark results from the heavy cost of updating FPSCR (the Floating-Point Status and Control Register) after each floating-point instruction. The rules for setting bits in this register are fairly complex, and the code required to implement the register correctly is difficult for libbinrec to optimize, resulting in a significant performance penalty. Disabling updates to this register brings the translated code much closer to native performance, at the cost of no support for unmasked floating-point exceptions and certain minor differences in behavior (such as different NaN payloads in response to an invalid-operation exception) from an actual PowerPC CPU. Limitations ----------- libbinrec assumes that the input program it is given is valid machine code for the selected input architecture (referred to as the "guest" architecture in library documentation). The library will correctly detect and translate illegal instructions explicitly reserved as such by the architecture, such as the all-zero PowerPC instruction word, but other byte sequences which do not repesent a valid instruction for the guest architecture may result in behavior different from a physical implementation of that architecture. As a consequence, libbinrec may not correctly translate programs which include deliberately invalid instruction encodings, such as to take advantage of undocumented behavior of a specific architecture implementation. More generally, libbinrec is designed to translate between architectures rather than specific implementations; thus, as a rule, input code which triggers behavior undefined in the architecture specification may behave differently when translated by libbinrec than when run on a physical CPU implementing the guest architecture. For example, the PowerPC integer divide instructions specify that the contents of the destination register are undefined after divison by zero; libbinrec deliberately skips updating the register state in this case, so the value eventually stored back to the processor state block will depend on the structure of the translated code. This likely differs from the behavior of an actual PowerPC CPU, but libbinrec does not attempt to match that behavior. (There are a few exceptions to this rule, such as frC rounding for PowerPC floating-point multiply operations; see the guest-specific documentation for details.) libbinrec is intended as a pure program translator rather than a full-fledged architecture emulator, and consequently it does not support translation of privileged instructions in the input program. Non-privileged instructions which explicitly transfer control to a privileged handler, such as the PowerPC "sc" (system call) instruction, are translated by inserting a call to a handler function supplied by the library client. Similarly, libbinrec has no facility for handling memory-mapped I/O; if such support is needed, the client program should mark relevant regions of the guest memory space as unmapped on the host system and use unmapped-access host exceptions to detect and handle such accesses. libbinrec also does not attempt to count instruction cycles or otherwise reproduce precise timing behavior of the guest processor. libbinrec does not, in general, provide support for any high-level exceptions (C++ exceptions, for example), so client programs should take care that functions which can be called from generated code do not raise such exceptions, or provide appropriate wrappers for the platform and language which insulate such exceptions from generated code. As a convenience, libbinrec includes a mode for x86-64 Windows output (BINREC_ARCH_X86_64_WINDOWS_SEH) which follows the Windows-specific structured exception handling ABI; however, this ABI disallows certain optimizations which libbinrec normally makes use of, so code generated in this mode will generally perform worse than code generated in a non-SEH mode. See the documentation in include/binrec.h for limitations and other details of behavior with respect to individual architectures. Reporting bugs -------------- Please send any bug reports or suggestions directly to the author.