This package contains the initial version of the Chronos multi-core WCET analysis tool.

The package has been tested on a 32-bit machine running Ubuntu 8.10.

===========================
Source code and benchmarks
===========================

----chronos-multi-core-final

This directory contains the WCET analyzer targeting SimpleScalar PISA. The source code is in the chronos-multi-core-final/est directory.

----CMP_SIM

This directory contains the multi-core simulator, which supports a shared cache and a shared TDMA bus. The target architecture is SimpleScalar PISA 
(i.e. the same as the WCET analyzer). The source code of the simulator is in the CMP_SIM/simplesim-3.0 directory.

----benchmarks

Contains precompiled binaries of sample benchmarks targeting SimpleScalar PISA. Each subdirectory <bm> of the benchmarks directory 
contains the following relevant files:
	----<bm>.c : 
		C source file of the benchmark
	
	----<bm> :  
		SimpleScalar PISA binary of <bm>
	
	----<bm>.cfg : 
		Internal representation of the <bm> control flow graph
	
	----<bm>.loopbound : 
		Loop bound specifier. Each line gives the bound of one loop in the source file, in the following format:
			<procedure id>	<loop id>  <relative loop bound>	
		where <procedure id> is the id of the procedure in source code order and <loop id> is the id of the loop 
		relative to its enclosing procedure, also in source code order. 
	
	----<bm>.dis : Disassembled file from <bm> binary.
	
	----<bm>.cons : User-specified ILP constraints that help the WCET analysis.
	
	----<bm>.lp : ILP problem formulated to solve the WCET of <bm>. This <bm>.lp file can be solved by lp_solve.
	
	----<bm>.ilp : Same as <bm>.lp but with CPLEX support. If you use CPLEX, the input file should be <bm>.ilp instead of <bm>.lp
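
The <bm>.loopbound format described above can be read with a few lines of Python. This is a minimal sketch and not part of 
the tool: the names LoopBound and parse_loopbounds are hypothetical, the sample contents are made up, and skipping blank lines 
is an assumption.

```python
from typing import List, NamedTuple


class LoopBound(NamedTuple):
    procedure_id: int  # id of the procedure, in source code order
    loop_id: int       # id of the loop inside that procedure, in source code order
    bound: int         # relative loop bound


def parse_loopbounds(text: str) -> List[LoopBound]:
    """Parse the contents of a <bm>.loopbound file into LoopBound records."""
    bounds = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines carry no information
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        bounds.append(LoopBound(*map(int, fields)))
    return bounds


# Hypothetical file contents, not taken from a shipped benchmark.
sample = "0 0 10\n0 1 10\n1 0 5\n"
for lb in parse_loopbounds(sample):
    print(lb.procedure_id, lb.loop_id, lb.bound)
```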
	
----lp_solve 

This is a freely available LP solver which we used for our experiments. The WCET is formulated as an ILP problem, and the solver 
is used to obtain its value. "lp_solve/lp_solve" is the binary which invokes the solver. However, we strongly recommend the commercial solver 
CPLEX, as we found it much faster than lp_solve.

===============================
Download, compile and install
===============================

Download the package "chronos-mc-v1.tgz" and extract it in a directory, say $IDIR:

cmd> tar -xzvf chronos-mc-v1.tgz

Extraction produces four directories:

----chronos-multi-core-final
----CMP_SIM
----benchmarks
----lp_solve

Go inside the "chronos-multi-core-final" directory and execute the following commands to build 
the WCET analyzer: 

cmd> ./autotools.sh 

(Note: The above command may require additional software on your system (e.g. automake). 
If an error occurs, make sure all the required software is installed 
and then re-invoke the command.)

Configure:
-----------------------

cmd> ./configure --with-lp-solve=<Full path to lp_solve directory>

(Note: The above command sets up the necessary environment and generates the Makefile. 
It requires only the full path to the lp_solve directory as input. If you have extracted 
the package in a directory $IDIR, the following command should work: 

cmd> ./configure --with-lp-solve=$IDIR/lp_solve)

Compiling the analyzer:
------------------------
cmd> make 

If the compilation succeeds, it produces a binary called "est" inside 
chronos-multi-core-final/est. This is the WCET analyzer.

Check whether everything is all right:
----------------------------------------
Execute the following sample test from the chronos-multi-core-final directory:

cmd> ./est/est -config processor_config/exp_cache/processor_l2.opt task_config/exp_cache/tgraph_mrtc_est_2

If the above command executes successfully and gives a valid WCET value, then you have installed 
the analyzer successfully.

Compiling the simulator:
-------------------------
Go inside the CMP_SIM/simplesim-3.0 directory and build the simulator:

cmd> make

If the compilation succeeds, it produces a binary "sim-outorder", which is used for simulation.

Checking the simulator:
-------------------------

Execute the following command from the CMP_SIM/simplesim-3.0 directory to test the simulator:

cmd> ./sim-outorder -sconfig sim-test/exp_cache/processor.opt sim-test/exp_cache/cnt.arg sim-test/exp_cache/jfdcint.arg

If the above command prints valid simulation results at the end, then you have successfully 
installed the multi-core simulator.


=============================
Running the WCET analyzer:
=============================

From the directory "chronos-multi-core-final", you can run the Chronos multi-core analyzer using the following 
command:

cmd> ./est/est -config <processor_config> <task_config>

where <processor_config> is the file providing the micro-architectural configuration and <task_config> is the file 
providing the mapping of tasks to cores. The format of these files is described below:

processor_config:
-------------------
processor_config describes the micro-architectural configuration. It follows 
the same format as SimpleScalar, with a few modifications to handle multi-core specific features. 
A typical processor configuration file looks as follows (each line is described after the # sign):

-cache:il1 il1:16:32:2:l 		# 2-way associative, 1 KB L1 cache
-cache:dl1 none 					# perfect L1 data cache
-cache:il2 il2:32:32:4:l 		# 4-way associative, 4 KB L2 cache
-cache:il2lat 6					# L2 cache hit latency = 6 cycles 
-mem:lat 30 2 						# memory latency = 30 cycles
-bpred 2lev							# 2 level branch predictor
-fetch:ifqsize 4					# 4-entry instruction fetch queue
-issue:inorder true 				# inorder processor
-decode:width 2 					# 2-way superscalar
-issue:width 2 					# 2-way superscalar
-commit:width 2 					# 2-way superscalar
-ruu:size 8							# 8-entry reorder buffer 
-il2:share 2						# 2 cores share an L2 cache
-core:ncores 2						# total number of cores = 2
-bus:bustype 0						# TDMA round robin shared bus
-bus:slotlength 50				# bus slot length assigned to each core = 50 cycles

Note that, except for the last four parameters, all parameters are identical to those of SimpleScalar. A more detailed 
description of the parameters can be obtained by running sim-outorder without any input. 
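
The cache sizes quoted in the comments follow from the SimpleScalar cache specifier <name>:<nsets>:<bsize>:<assoc>:<repl>, 
where the total size is nsets * bsize * assoc bytes. The sketch below only checks that arithmetic; the names CacheGeometry 
and parse_cache_spec are hypothetical helpers and not part of the analyzer.

```python
from typing import NamedTuple


class CacheGeometry(NamedTuple):
    name: str
    nsets: int        # number of sets
    block_size: int   # bytes per cache block
    assoc: int        # associativity (ways per set)
    policy: str       # replacement policy: 'l' = LRU, 'f' = FIFO, 'r' = random

    @property
    def size_bytes(self) -> int:
        # Total capacity = sets * block size * ways.
        return self.nsets * self.block_size * self.assoc


def parse_cache_spec(spec: str) -> CacheGeometry:
    """Split a specifier such as 'il1:16:32:2:l' into its five fields."""
    name, nsets, bsize, assoc, policy = spec.split(":")
    return CacheGeometry(name, int(nsets), int(bsize), int(assoc), policy)


for spec in ("il1:16:32:2:l", "il2:32:32:4:l"):
    cache = parse_cache_spec(spec)
    print(cache.name, cache.size_bytes, "bytes")
# il1 1024 bytes
# il2 4096 bytes
```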

The last four parameters are introduced for multi-core WCET analysis and are described below:

-core:ncores    

Total number of cores in the processor. Default value is 1.

-il2:share

Number of cores sharing an L2 cache. Default value is 1. Therefore, providing only "-core:ncores 2" 
does not mean that the two cores share an L2 cache; "-il2:share 2" must be provided as well (as 
shown in the example above). Some examples of using these two arguments:
	
	-core:ncores 2 , -il2:share 1 : 2 cores with private L2 caches
	-core:ncores 2 , -il2:share 2 : 2 cores with shared L2 cache
	-core:ncores 4 , -il2:share 2 : 4 cores with a group of two cores sharing an L2 cache
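
The three combinations above can be pictured as a partition of the cores into L2 groups. The sketch below is only an 
illustration: l2_groups is a hypothetical helper, and it assumes that cores with consecutive ids share a cache and that 
the number of cores is a multiple of the share value. Neither assumption is stated by the tool.

```python
from typing import List


def l2_groups(ncores: int, share: int) -> List[List[int]]:
    """Return the groups of core ids that would share one L2 cache each."""
    if ncores < 1 or share < 1:
        raise ValueError("ncores and share must be positive")
    if ncores % share != 0:
        # Assumption: partial groups are not handled in this sketch.
        raise ValueError("ncores must be a multiple of share")
    # Assumption: consecutive core ids are grouped together.
    return [list(range(start, start + share)) for start in range(0, ncores, share)]


print(l2_groups(2, 1))  # [[0], [1]]        private L2 caches
print(l2_groups(2, 2))  # [[0, 1]]          one shared L2 cache
print(l2_groups(4, 2))  # [[0, 1], [2, 3]]  two L2 caches, two cores each
```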

-bus:bustype	

Default value is -1, which means a perfect shared bus that introduces zero delay for any 
bus transaction. 

If the value is 0 (as in the example), the bus is a round-robin TDMA bus in which a fixed-length bus slot 
is assigned to each available core. 

-bus:slotlength

Only relevant if -bus:bustype is set to 0. Gives the length (in cycles) of the bus slot assigned to each core in the 
round-robin bus schedule.
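
To make the round-robin TDMA schedule concrete, the sketch below computes which core owns the bus at a given cycle and how 
long a core waits for its next slot. It is a simplified model under stated assumptions, not the analyzer's bus model: 
slots are assumed to be handed out in core-id order starting with core 0 at cycle 0, and the length of the bus transaction 
itself is ignored. The function names are hypothetical.

```python
def slot_owner(cycle: int, ncores: int, slot_length: int) -> int:
    """Core that owns the bus at the given cycle (assumes core 0 starts at cycle 0)."""
    return (cycle // slot_length) % ncores


def wait_for_slot(cycle: int, core: int, ncores: int, slot_length: int) -> int:
    """Cycles a request issued at `cycle` waits until `core` owns the bus."""
    period = ncores * slot_length   # length of one full round of slots
    offset = cycle % period         # position inside the current round
    start = core * slot_length      # where this core's slot begins in a round
    if start <= offset < start + slot_length:
        return 0                    # the request arrives inside the core's own slot
    return (start - offset) % period


# Values from the example configuration: 2 cores, slot length 50 cycles.
print(slot_owner(120, 2, 50))        # 0
print(wait_for_slot(60, 0, 2, 50))   # 40
print(wait_for_slot(75, 1, 2, 50))   # 0
```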


Example processor configurations:
-----------------------------------
Numerous processor configuration examples are provided in directories chronos-multi-core-final/processor_config/exp_*. 
These can be used to run the example programs.


task_config:
-------------
task_config represents the task names and their mapping to different cores. This file is viewed by the analyzer 
front end as a task graph. The abstract syntax of any <task_config> is as follows: 

<Total number of tasks>
<Path to task 1>
<priority of task 1> <assigned core to task 1> <MSC id to task 1> <list of successor task ids of task 1 according to the partial order>
<Path to task 2>
<priority of task 2> <assigned core to task 2> <MSC id to task 2> <list of successor task ids of task 2 according to the partial order>
.........(for all the tasks)

Therefore, every two lines in the task_config file give a task name and its parameters (priority, assigned core, etc.).

Although the front end reads the file in a task graph format, scheduling of the task graph is not currently supported. 
The tool currently derives the WCETs of interfering tasks running on different cores. Therefore, for the 
current release of the tool, only the following parameters are important: 

<Total number of tasks> : Total number of tasks running on different cores
<path to task i> : Relative or full path name pointing to the simplescalar PISA binary of the task.
<assigned core to task i> : The assigned core number to the task

The following is an example of a <task_config> file:

2
../benchmarks/cnt/cnt
0 0 0 0
../benchmarks/jfdcint/jfdcint
0 1 1 0

The above task_config file states that we are running "../benchmarks/cnt/cnt" on core 0 and "../benchmarks/jfdcint/jfdcint" on 
core 1 concurrently.
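
The two-lines-per-task layout can be read as follows. This is a minimal sketch with hypothetical names (Task, 
parse_task_config), not the analyzer's front end. The first three integers are taken as priority, core and MSC id as in 
the abstract syntax above; the remaining integers are kept uninterpreted, since their exact encoding is an assumption 
this sketch avoids making.

```python
from typing import List, NamedTuple


class Task(NamedTuple):
    path: str                    # path to the SimpleScalar PISA binary
    priority: int
    core: int                    # core the task is assigned to
    msc_id: int
    successor_fields: List[int]  # trailing integers, left uninterpreted here


def parse_task_config(text: str) -> List[Task]:
    """Parse the contents of a task_config file into Task records."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty task_config")
    ntasks = int(lines[0])
    if len(lines) != 1 + 2 * ntasks:
        raise ValueError(f"expected {2 * ntasks} task lines, got {len(lines) - 1}")
    tasks = []
    for i in range(ntasks):
        path = lines[1 + 2 * i]
        fields = [int(f) for f in lines[2 + 2 * i].split()]
        if len(fields) < 3:
            raise ValueError(f"task {i}: expected at least 3 parameters")
        priority, core, msc_id, *rest = fields
        tasks.append(Task(path, priority, core, msc_id, rest))
    return tasks


example = """2
../benchmarks/cnt/cnt
0 0 0 0
../benchmarks/jfdcint/jfdcint
0 1 1 0
"""
for task in parse_task_config(example):
    print(task.path, "-> core", task.core)
```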

Example task configurations:
-----------------------------------
Numerous task configuration examples are provided in directories chronos-multi-core-final/task_config/exp_*. 
These can be used to run the example programs. Before running them, verify that the path names 
of the benchmarks/tasks in the task_config file are correct.

(Note: MSC id is the message sequence chart id. It is retained mainly for compatibility reasons and may be removed later. 
The same MSC id (say 0) can be used for every task. The list of successor task ids is a list of integers identifying 
the successor tasks.)


=============================
Running the Simulator:
=============================
From the directory "CMP_SIM/simplesim-3.0", you can run the multi-core simulator using the following command:

cmd> ./sim-outorder -sconfig <processor_config> <arg_file 1> <arg_file 2> .....

The above command simulates the task given by <arg_file 1> on core 0, the task given by <arg_file 2> on core 1, 
and so on, with the shared configuration given through <processor_config> (explained below in more detail).

Each <arg_file i> names the task running on a dedicated core and gives the configuration of that core. 
A typical arg file looks like this: 

../../benchmarks/cnt/cnt > sim-test/exp_cache/cnt.out
-issue:inorder true
-issue:wrongpath false
-bpred 2lev
-mem:width 32
-ruu:size 8
-fetch:ifqsize 4
-cache:il1 il1:16:32:2:l
-cache:il2 il2:32:32:4:l
-cache:il2 none
-cache:dl1 none
-cache:dl2 none
-cache:il2lat 6
-mem:lat 30 2
-decode:width 1
-issue:width 1
-commit:width 1
-dtlb none
-itlb none

The first line of the above arg file gives the benchmark running on the core (i.e. ../../benchmarks/cnt/cnt). The 
benchmark name must point to the SimpleScalar PISA binary of the corresponding benchmark (check that path). The detailed 
output of the simulation is redirected to the file "sim-test/exp_cache/cnt.out". The simulator also prints its 
results to standard output. Note that a comparison with the WCET analyzer must use the "effective" cycles, 
because the effective cycles ignore the time spent in libraries (as does the WCET analyzer).

The remaining lines of the arg file give the micro-architectural configuration of the core, in exactly the same format 
as the single-core SimpleScalar release. 
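
The arg file layout (one line naming the binary and its output file, followed by one option per line) can be split apart 
as in the sketch below. The names ArgFile and parse_arg_file are hypothetical and the code is not part of the simulator. 
Options are kept as an ordered list so that repeated options are preserved as written.

```python
from typing import List, NamedTuple, Optional, Tuple


class ArgFile(NamedTuple):
    binary: str                            # path to the benchmark binary
    output: Optional[str]                  # file named after '>', if any
    options: List[Tuple[str, List[str]]]   # (option name, values) in file order


def parse_arg_file(text: str) -> ArgFile:
    """Split an arg file into the benchmark line and the per-core options."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty arg file")
    # First line: "<binary> > <output file>".
    binary, _, output = (part.strip() for part in lines[0].partition(">"))
    options = []
    for line in lines[1:]:
        name, *values = line.split()
        options.append((name, values))
    return ArgFile(binary, output or None, options)


# Shortened version of the arg file shown above.
example = """../../benchmarks/cnt/cnt > sim-test/exp_cache/cnt.out
-issue:inorder true
-cache:il1 il1:16:32:2:l
-mem:lat 30 2
"""
arg = parse_arg_file(example)
print(arg.binary)
print(arg.output)
for name, values in arg.options:
    print(name, values)
```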

The <processor_config> file (given with the -sconfig option) provides the configuration shared among the cores. The following is an 
example:

-core:ncores 2			# total number of cores
-bus:bustype 0			# TDMA round robin bus
-bus:slotlength 50	# bus slot length assigned to each core
-il2:share 2			# number of cores sharing an instruction L2 cache
-dl2:share 2			# number of cores sharing a data L2 cache

Note that the above parameters have the same interpretation as in the multi core WCET analyzer. 


Example simulations:
----------------------
Numerous examples have been provided in directory CMP_SIM/simplesim-3.0/sim-test/exp_*. Check out the "run_one" script 
in directories CMP_SIM/simplesim-3.0/sim-test/exp_* to get some sample commands for simulations.

============
Contact
============
This code is no longer maintained. However, if you need help in running/compiling the code, 
please feel free to contact Sudipta Chattopadhyay (sudiptaonline@gmail.com)


===============
Known issues:
===============

1. Memory requirement of offset propagation. This can be fixed in the future. 

2. The LP solver sometimes gets stuck when branch prediction is turned on. 
We have verified that the problem lies with the solver: 
CPLEX finds the solution within seconds. The code currently hardcodes 
lp_solve as the LP solver. CPLEX is recommended, and someone else can 
change the code to use it :-)

3. For the sensitivity results, 1 or 2 benchmarks need to be disabled when 
speculation is turned on, because of the issue in (2). However, the results still 
represent the average over at least 11-12 benchmarks.