Project 2: Intra-procedural path tracing

2.1 Background

This project is about implementing the Efficient Path Profiling paper. You can find the paper here.

2.2 Implementation

The implementation of the project is in 2 stages.

Modified Ball-Larus algorithm
Block regeneration algorithm

2.2.1 Modified Ball-Larus algorithm

We can briefly summarize the Ball-Larus algorithm as follows:

Generate the Control Flow Graph (CFG) for each method
Modify the CFG to handle loops
Assign weights to the edges based on the bottom-up edge assignment algorithm
Add instrumentation for recording the path sums at the end of each method and at each back edge in the CFG

We modify the above algorithm slightly. Instead of recording the path sums, we output the path sum into the console.

2.2.2 Block regeneration algorithm

In SOOT, we can have unique ID for the basic blocks present in a method. After the execution of the instrumented program, we get a stream of BL identifiers. Each identifier represent a sequence of basic blocks present in the program. As part of post-processing the output produced, replace each BL identifier with the sequence of block IDs produced by SOOT.

2.3 Requirements

This section gives the list of things you need to accomplish to complete this project.

2.3.0 CFG

You will be working with ExceptionalBlockGraph as the CFG for each method. A good graph API can be used to reduce the burden of designing your own data structures and algorithms for graph traversals. JGraphT is a really good graph library for Java. The JAR file has been already included as part of the Eclipse classpath.

2.3.1 Instrumentation

Apply BL algorithm for the CFG of each method to get the edge assignments. Instrument normal edges, back edges and method exit statements with appropriate instructions. Output the path sums to System.out.

NOTE: Only BL path sums should be present in the standard output when executed. The output should not contain any other information like method entry, exit logs, method identifiers and so on.

2.3.2 BL identifier to Block ID mapping

After the instrumentation, obtain a mapping for the set of blocks that a particular BL identifier represents and store it in a temporary file in "sootOutput" directory. You will need this file to regenerate the blocks IDs which were executed.

2.3.3 Block Generation algorithm

After the execution of the modified bytecode from the "sootBin/" folder, you get a output file containing the console output placed in "output/" folder. After this, e0210.ProcessOutput.java will be executed as part of processing the output generated. This step it to replace each BL identifier with the set of blocks it represents using the mapping generated from the step 2.3.2. This modified output file will be placed in the "processed-output" folder. Expected output file will be compared against the output file present in the "processed-output/" folder.

NOTE:

You will need to instrument all the methods in the program. Including the constructors and class initialization methods.
You need to figure out a way to differentiate between the BL IDs of different methods (from algorithm, BL IDs start from 0 for each method).

2.4 Example

Program:

#!java

public class Main {

    public static void main(String[] args) {

        int a = args[0].length();
        if (a > 0)
            foo(a);
        else
            bar(a);

        return;
    }

    static boolean foo(int a) {
        if (a == 1)
            return true;
        else
            return false;
    }

    static boolean bar(int a) {
        if (a == 0)
            return true;
        else
            return false;
    }

}

CFG from ExceptionBlockGraph:

main.dot.png

bar.dot.png

foo.dot.png

Sample Output (BL IDs): This output may differ

Input: 4

#!java

0 - True branch in foo()
1 - True branch in main()

Processed-output (Block IDs):

#!java

0
2 - Block IDs in foo()

0
1
3 - Block IDs in main()

Wiki

e0210-project / Project-2