\author{Anton Golov}

\usepackage{amssymb, amsmath}

\title{RISC vs CISC}


    In assembly code, certain combinations of operations occur fairly often.
    For instance, the result of a binary operation must often immediately be
    written back to memory, and returning from a function call generally
    follows a standardised procedure.  The designer of a CPU instruction set
    can decide to provide assembly instructions that perform such a compound
    action in a single step.  Alternatively, they can require that the
    individual instructions be written out.

    When an implementer chooses to provide instructions that do multiple things,
    we speak of a \textbf{complex instruction set computer}.  Instruction sets
    based on this tend to support high-level programming constructs by providing
    more control flow features.  This makes it easier to write assembly code by
    hand; however, around 1970 it became clear that most compilers only utilised
    a small subset of the provided features.

    In the opposite case, when an instruction set provides a small set of core
    features, we speak of a \textbf{reduced instruction set computer}.  Due to
    the smaller number of instructions, the resulting CPU design is simpler and
    the layout of the instructions can be more uniform.  However, an assembly
    programmer working with such an instruction set would have to spend
    considerable time writing out sequences that would be a single instruction
    on a CISC system.

    In this paper, we shall look in more depth at the patterns generally
    followed by RISCs and CISCs.  As an example on the CISC side we will use the
    x86 architecture, which is the most common architecture used in personal
    computers today.  As an example on the RISC side, we will use ARM, an
    architecture commonly used in smartphones.

    % Are we still going to use this?
    \chapter{The Origin of CISC}

    The first computer that stored its program in memory was the Manchester
    Small Scale Experimental Machine\cite{mssem}, which was made in 1948.  The
    first compiler appeared four years later\cite{first-compiler}, but the use
    of assembly language for programming persisted for significantly longer.

    With a significant portion of assembly code being written directly by
    programmers, as opposed to being autogenerated, architecture designers
    would often optimise for the ease of the programmer at the expense of the
    complexity of the CPU\@.  This was often done by combining arithmetic and
    load/store operations, providing advanced forms of control flow, and
    supporting more complex addressing modes.

    Apart from making the job of the assembly programmer easier, the lower
    number of instructions necessary to achieve the desired result also meant
    a smaller code footprint.  As cache technology was still rather limited and
    main memory access was a bottleneck for execution speed, this was a more
    significant factor than the time required to decode an instruction.

    The 8086 processor provides examples of all three of the above methods.
    The addition, subtraction and comparison operators could combine an
    immediate value with a value stored in memory, and apart from the standard
    conditional and unconditional jump instructions, the 8086 had instructions
    for looping on a condition.  Furthermore, the 8086 supported memory
    segmentation, with jump and return instructions having different forms for
    operating directly or indirectly, and within a segment or across segment
    boundaries.

    Due to the late arrival of the term RISC, early architectures were
    categorised as neither CISC nor RISC\@.  In the case of the earliest
    computers, the limited opcode size led to a fairly limited number of
    instructions; furthermore, the technology to perform complex instruction
    decoding was simply not available. % CITE!
    % I'm not even sure this is true, going by pdp-11-cisc


    % ...
    % ...
    % ...


    % My part: differences in software, hardware, memory usage
    We shall now give a more in-depth comparison of RISC and CISC, starting at
    a fairly high level and moving down towards the CPU internals, and finally
    demonstrating some performance implications of these differences.

    Some differences between RISC and CISC are already faintly visible at the
    level of C code.  RISC architectures are more commonly based on the
    Harvard Architecture than CISC systems are\cite{arxiv-cisc-risc}.  Certain
    operations that would work on a Von Neumann Architecture become
    unavailable: for example, if function pointers have a different size than
    data pointers, then a conversion from \texttt{void(*)()} to \texttt{void*}
    and back may not give back the original value.

    The main difference in software, however, is at the compiler level.  CISC
    architectures, while complex at the hardware level, allow for significantly
    simpler compiler design.  C code would often map fairly closely to the
    resulting assembly thanks to the addressing modes and direct operations on
    memory.  For instance, the C statement \texttt{a = a * b;} would likely map
    to a single CISC instruction, as opposed to roughly four RISC
    instructions.
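    A hedged sketch of the difference, using addition (where x86 allows a
    memory operand as the destination; mnemonics are simplified and the
    addresses symbolic, so this is an illustration rather than exact output
    of any particular compiler):

```asm
; CISC (x86-style): a read-modify-write on memory in one instruction
add   dword [a], ebx      ; a = a + b, with b already held in ebx

; RISC (ARM-style load/store): memory access and arithmetic are separate
ldr   r0, [r2]            ; load a       (r2 holds the address of a)
ldr   r1, [r3]            ; load b       (r3 holds the address of b)
add   r0, r0, r1          ; compute a + b in registers
str   r0, [r2]            ; store the result back to a
```

    The RISC sequence needs three registers and four instructions, but each
    instruction does one simple thing, which is what makes uniform encoding
    and pipelining easier.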

    On the other hand, RISC systems typically offer significantly more
    registers.  This, combined with the fewer addressing modes, leads to a
    focus on optimal register allocation and on minimising load/store
    operations.  RISC CPUs are also less likely to have features such as
    out-of-order execution, meaning these optimisations have to be performed
    at compile time.
    The benefits of assembly providing high-level constructs also became less
    significant as languages with a non-imperative style gained prominence.  The
    translation step necessary for RISC architectures became small compared to
    the translation step necessary for a conversion from a functional to an
    imperative style, and advances in compiler design solved much of these
    difficulties.
    Continuing down to the assembly level, we come across what could be called
    the defining difference between CISC and RISC: a RISC instruction either
    performs a computation or performs a memory access, but not both.  This is
    generally called a load/store architecture, and many other differences in
    convention follow from it.  A load/store architecture does not support
    many of the addressing modes that are generally present on a CISC system,
    and with far fewer instructions accessing memory, it is easier to make the
    durations of instructions match more closely.  Fewer addressing modes also
    mean simpler instructions, and it becomes possible to efficiently make the
    size of the instructions uniform\cite{the-post-risc-era}.

    % Tjark: Programmers

    Finally, we can talk about RISC and CISC on a hardware level.  The most
    obvious difference between the two is the size of the decoder: the numerous
    and variable-length instructions a CISC architecture generally has require
    significantly more work to decode than the few standard-format instructions
    of a RISC one.

    As we have seen before, RISC systems often use the reduced decoder size in
    order to fit more registers.  Reduction of power usage and heat production
    are also common goals, leading to a preference for RISC in mobile devices.
    We have also mentioned the preference of RISC for a Harvard Architecture;
    this allows for a separate instruction and data cache, permitting
    simultaneous access to both.

    As RISC instructions tend to be simpler and, more importantly, have fewer
    side effects, they are substantially easier to pipeline.  Superscalar
    execution, the parallel execution of multiple instructions per clock
    cycle, was initially a RISC feature; however, the performance benefits it
    provided led to it being adopted across the board.

    The cross-adoption of CISC and RISC features is a common theme.  While RISC
    and CISC started out as significantly different systems from a hardware
    point of view, advances in CPU design made the initial differences less
    and less noticeable.  While the smaller decoder in RISC systems was a
    significant distinction in the late seventies and early eighties, new
    components such as caches and floating point units made this difference
    much less significant.

    In section~\ref{combination-cisc-risc} we will see the result of these
    changes, and how they lead to the architectures of today, which generally
    contain both RISC and CISC features.


    % Tjark: Performance

    Another key factor for performance is memory usage.  Initially, it would
    seem that CISC is clearly the winner here: the ability to do more with
    fewer instructions would imply a smaller code footprint and thus less
    memory usage, with all the benefits described above.  The more advanced
    addressing modes could also achieve the same result in fewer operations;
    for instance, a pointer to pointer addressing mode would make the storage
    of the intermediate pointer unnecessary.

    Initially this difference was significant, but the benefit diminished due
    to a lack of compiler support: many compilers were not advanced enough to
    take advantage of CISC features, and thus compiled code for CISC machines
    was not significantly smaller.  CPU developers reacted by optimising for
    the instructions generally used by compilers, and so while the CISC parts
    were kept for backwards compatibility, the fastest code was often similar
    to RISC code.

    With the advance of caching technology, the extra cost in transistors that
    CISC incurred became an issue, as it meant that a CISC CPU could not also
    feature an on-chip cache.  This gave RISC architectures a performance edge;
    however, this edge was not directly caused by any feature of RISC itself,
    only by the cache\cite{myth-and-reality}.  Once transistor counts increased
    and CISC architectures could support an on-chip cache, this performance
    factor disappeared.
    \chapter{Recent and Future Developments}

    % ...
    % ...
    % ...

    \section{Combination of CISC and RISC}

    % ...
    % ...
    % ...

    \begin{thebibliography}{9}
        \bibitem{mssem}
            \emph{The Manchester Small Scale Experimental Machine}.\\
            University of Manchester, 1998.
        \bibitem{first-compiler}
            Harold ``Bud'' Lawson and Howard Bromberg,\\
            \emph{The World's First COBOL Compilers}.\\
            Stanford University, 1997.
        \bibitem{intel-8086}
            \emph{8086 16-BIT HMOS Microprocessor}.\\
            Intel, 1990.
        \bibitem{anatomy-modern-processors}
            John Morris,\\
            \emph{The Anatomy of Modern Processors}.\\
            University of Auckland, 1999.
        \bibitem{risc-to-cisc-recompilation}
            Michael Steil,\\
            \emph{Dynamic Re-compilation of Binary RISC Code for CISC Architectures}.\\
            Technische Universit\"at M\"unchen, 2004.
        \bibitem{uva-risc-vs-cisc}
            Yuan Weit et al.,\\
            \emph{RISC vs CISC}.\\
            University of Virginia.
        \bibitem{arxiv-cisc-risc}
            Farhat Masood,\\
            \emph{RISC and CISC}.\\
            National University of Sciences and Technology.
        \bibitem{the-post-risc-era}
            \emph{RISC vs.\ CISC: The Post-RISC Era}.
    \end{thebibliography}