# Overview

Atlassian Sourcetree is a free Git and Mercurial client for Windows.

Atlassian Sourcetree is a free Git and Mercurial client for Mac.

# FTensor

FTensor is a set of C++ classes that allow a great deal of abstraction when dealing with tensors, yet delivers uncompromising efficiency. It uses template expressions to provide expressiveness and speed.

FTensor is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. A copy of the license should be included in the file LICENSE.

FTensor uses template expressions to optimize code. I owe a huge debt to Todd Veldhuizen who originally used template expressions in developing the Blitz library (see www.oonumerics.org). FTensor uses many of the ideas from that library.

FTensor's biggest claim to fame is that it handles implicit summation. Thus you can write

A(i,j) = B(i,k)*C(k,j)

instead of having to write

A = sum(B(i,k)*C(k,j),k)

Also, the result is strongly typed by the indices, so you can't write

A(i,k) = B(i,k)*C(k,j)

or even

A = B(i,k)*C(k,j)

It has `Tensor0`

, `Tensor1`

, `Tensor2`

, `Tensor2_symmetric`

, `Dg`

(Rank 3 symmetric on the first two indices), `Tensor3_antisymmetric`

(antisymmetric on the last two indices), `Christof`

(Rank 3 symmetric
on the last two indices), `Ddg`

(Rank 4 symmetric on the first two,
and last two, indices), `Riemann`

(Rank 4 antisymmetric on the first
two, and last two, indices, and symmetric under cyclic permutation of
the last three indices) and `Tensor4`

. This eclectic choice of tensors
happened because the library was originally developed for a General
Relativity code.

The dimension of the tensors are determined by a template parameter. You can construct a Tensor like this.

Tensor1<double,5> T1 {0,1,2,3,4};

The order of the elements in the constructors follow row-major convention.
For example in a `Tensor3`

, data elements are as follow:

Tensor3<int,N,M,P> {d000, d001, ..., d00P, d010, d011, ..., d0MP, d100, d101, ..., dNMP}

If you want to turn on bounds checking (so
that, for example, `T1(5)`

will give you a run time error) then compile
everything with `FTENSOR_DEBUG`

defined.

It can handle pointers. So you could write

double a0[10000], a1[10000], a2[10000], b0[10000], b1[10000], b2[10000], c0[10000], c1[10000], c2[10000]; Tensor1<double*,3> A(a0,a1,a2), B(b0,b1,b2), C(c0,c1,c2); Index<'i'> i; for(int a=0;a<10000;a++) { A(i)=B(i)+C(i); ++A; ++B; ++C; }

If you are familiar with Blitz, it also uses template expressions to optimize code. However, Blitz optimizes one expression at a time. So, for example, if you want to invert a 3x3 matrix, you can write it like

det=a(0,0)*a(1,1)*a(2,2) + a(1,0)*a(2,1)*a(0,2) + a(2,0)*a(0,1)*a(1,2) - a(0,0)*a(2,1)*a(1,2) - a(1,0)*a(0,1)*a(2,2) - a(2,0)*a(1,1)*a(0,2); inverse(0,0)= (a(1,1)*a(2,2) - a(1,2)*a(1,2))/det; inverse(0,1)= (a(0,2)*a(1,2) - a(0,1)*a(2,2))/det; inverse(0,2)= (a(0,1)*a(1,2) - a(0,2)*a(1,1))/det; inverse(1,1)= (a(0,0)*a(2,2) - a(0,2)*a(0,2))/det; inverse(1,2)= (a(0,2)*a(0,1) - a(0,0)*a(1,2))/det; inverse(2,2)= (a(1,1)*a(0,0) - a(1,0)*a(1,0))/det;

However, `det`

is just going to be thrown away at the end. We don't
need to store it for all (10,000 or whatever) points. We just need to
compute it for one point, use it in six expressions, and forget it.
The Blitz method makes you ship the memory of `det`

in and out of the
cache 6 times. A better way to do this is to put the whole inversion
into one loop. I've seen a factor of 4 improvement doing it this way.
The disadvantages, which are all-too-real, are that you have to
manually start the loop, and you have to remember to increment the
variables. In the case of the inversion, it ends up looking like

double det; for(int i=0;i<10000;i++ { det=a(0,0)*a(1,1)*a(2,2) + a(1,0)*a(2,1)*a(0,2) + a(2,0)*a(0,1)*a(1,2) - a(0,0)*a(2,1)*a(1,2) - a(1,0)*a(0,1)*a(2,2) - a(2,0)*a(1,1)*a(0,2); inverse(0,0)= (a(1,1)*a(2,2) - a(1,2)*a(1,2))/det; inverse(0,1)= (a(0,2)*a(1,2) - a(0,1)*a(2,2))/det; inverse(0,2)= (a(0,1)*a(1,2) - a(0,2)*a(1,1))/det; inverse(1,1)= (a(0,0)*a(2,2) - a(0,2)*a(0,2))/det; inverse(1,2)= (a(0,2)*a(0,1) - a(0,0)*a(1,2))/det; inverse(2,2)= (a(1,1)*a(0,0) - a(1,0)*a(1,0))/det; ++a; ++inverse; }

Forgetting to put in the ++ operators could result in subtle bugs. You could also have problems if you put in more than one loop:

for(int i=0;i<10000;i++) { a(i,j)=... ++a; } for(int i=0;i<10000;i++) { a(i,j)+=... ++a; }

This will end up writing off of the end of a. Furthermore, I use the
`restrict`

keyword, so you might get some weird problems if you try to
alias things. You might want to `#define`

the `restrict`

away. I found
that it actually decreased performance for extremely complicated
expressions.

Basically, you're giving up some expressive power.

It can handle quite complex expressions. As a real life example

K_new(i,j)=Lapse*(R(i,j) + Trace_K*K(i,j) - (2*K(i,k)^K_mix(k,j)) - 0.5*matter_ADM*g(i,j) - S_ADM(i,j)) + Shift_up(k)*dK(i,j,k) + (K(i,k)*dShift_up(k,j) || K(j,k)*dShift_up(k,i)) - ddLapse(i,j) + (dLapse(k)*christof(k,i,j));

`K_new`

is symmetric, the ^ operator means contract to make a
`Tensor2_symmetric`

, and the || means add to make a `Tensor2_symmetric`

(it is not a symmetrizer, so it doesn't divide by 2). I had to use
these operators (instead of * and +) to keep the compilers from making
a `Tensor2`

instead. You can't assign a `Tensor2`

to a `Tensor2_symmetric`

,
so you have to explicitly request the symmetrized result.

Some compilers were able to optimize the entire expression. The page at

http://www.wlandry.net/Projects/FTensor

has a more complete discussion of compilers and performance.

There is a directory for tests with a README that should explain how to use them.

If you have any questions, feedback, bug reports, feature requests, etc., feel free to send me an email.

Enjoy, Walter Landry wlandry@caltech.edu