Archive

Posts Tagged ‘code generator’

Ein (Einstein Index Notation) – A Domain Specific Language (DSL) Interpreter for Index Notation

January 11th, 2011 No comments

Today I have unveiled a small project I’ve been working on and off (mainly off) for the past 9 months. If you don’t care about the steps, but are interested in the implementation and current results of the language skip down to the Language Information section.

Ein Developement History:

The idea for creating a DSL mainly formulated back in April of 2010, when I participated in a small Scala DSL-based hackathon at Stanford. The language was called Liszt, after the music composer. It is currently being developed (hopefully) by some Ph.d. students and aims to create a DSL for the purpose of reusing as much code as possible when migrating from single-core to multi-core/multi-GPU-core/cluster architectures. The idea was to have the compiler figure out all the nifty multi-core/MPI/CUDA details and the coder could simply concentrate on writing some sort of “higher-level” code. As a person who deals with numerical computation in the general area of continuum mechanics , biomechanics, and finite elements, I jumped on board as an early test subject to see if I could leverage any of it to my advantage.

During the Liszt hackathon, some rules were enforced, such that we were not allowed to use iterators. (I assume the rules have something to do with the implementation of Liszt a year ago). This made multiplying matrices, and so forth much more difficult. In fact, writing a small finite element routine is quite tedious with iterators or iterable constructs (such as for-loops, and etc). Because of this restriction, I realized that even with Matlab, high-order tensors necessitated using for loops and there were no simple constructs for  handling operations such as double contractions in Matlab cleanly. (Yes, I do realize you could do something like sum(matrix1.*matrix2), but that has more to do with utilizing underlying knowledge of Matlab data structures than with Matlab being designed for index notation. )

At any rate, Einstein Index Notation, is something used quite commonly in the area of mechanics and is named after its “inventor”. It’s based on several important conceptual notations, but the practical reasoning behind the notation is in implied summations over repeated indices (aka. dummy indices). With the exception of the assembly operator and some other operators related to boundary conditions, finite elements are formulated and then coded directly using index notation.

After listening to several seminar lectures before April 2010 about Liszt, I became interested in Scala and used it to generate some object oriented meshes for a group of randomized finite element simulations for research. After my frustration with the initial version of Liszt, I explored the interwebz and found some tutorials/blogs about implementing internal and external DSLs in Scala. I started writing an indicial notation internal DSL, but realized that I actually needed to write a Abstract Syntax Tree (AST) for my language because of sub expression in parenthesis and such. I started writing my own, but realized that I didn’t really have a solid understand of Scala case classes, and Scala Parser combinators were quite daunting, as I had only used Scala in a functional programming paradigm for the aforementioned mesh generator. Luckily I found this project called Kiama, which “is a Scala library for language processing”. Luckily, I found an example called Imperative, which shows how to parse a simple mathematical language described by a custom AST. This gave me enough time and incentive to play around with Scala case classes and parser combinators.

After setting up and designing the AST for my language back in June, due to research deadlines, I went on a development haitus until Xmas, when I decided to take a break from research. Implementing index notation rules was more difficult that I imagined, since humans make lots of checks and can naturally translate notation to code.

The following is a list of things that had to be checked, or were things I never thought of as a mere human:

  1. Checking for repeated indices only show up twice.
  2. Checking Indicial ranges for possible IndexOutOfBoundException.
  3. Checking for mismatched or indices that would not result in the proper expressions
  4. Generating code from an AST
  5. Generating code optimizations due to symmetry of tensors (Voigt notation)

These things actually posed sufficient problems and resulted in a first attempt of 827 lines of Scala code. The code is currently somewhat messy, and a little hackish in terms of elegantly generating code and only generates C type languages (although it is designed to handle fortran, Matlab, and Python conventions). I will the language under the Simplified BSD license, although kiama is of course licensed under LGPL/GPL.

Language Information:

As noted before, Einstein Index Notation, is named after Albert Einstein. Since the first 3 letters of Einstein Index Notation are also the first three letteres of Einstein, the punny engineer in me has decided to name this tiny simple external Scala DSL as Ein.  Although the language does not account for contravariant and covariant indices and should therefore be named Index Notation (In), as “in”, is an English word, Ein seems like a better choice. (I apologize to Germans, since Ein is a German word, but since I’m the creator too bad.)

There are only a handful of statements that Ein currently supports:

  1. Set [T] = {dim1,dim2,…dimn}; : This defines the dimensions of tensor [T]. The length of the dimensions is its order.
  2. [A] = expr; : This assigns a tensor [T] to the given expression.
  3. env; : This displays the current information known by Ein about defined tensors.

Tensors are represented as follows:

identifier {indices} : identifier is the “name” of the Tensor. indicies are optional unless the expression necessitates indices (aka during Set statements they are not needed, nor for scalar tensors). Currently there are only two representations of indices and both only accept one character alphanumeric characters. Numbers should be between 1 and the specified or assumed dimension, if using repeated indices.

  1. Normal indices are just single characters and go between curly braces, i.e. T{i,i} for the trace of T.
  2. Voigt indices are specified within brackets, i.e. T{[i,i]} for the trace of T, where T is assumed by Ein to be symmetric. Ein does not check for symmetry.

Armed with this knowledge, one can start tapping out hopefully many Ein expressions. Below is a bit of code and some explanation about some generated test cases I have used.

Enter imperative language programs for parsing.
Ein> Set A = {3,3};
Size of A is List(3, 3)
Ein> p = A{i,i};
for (i=0;i<3+0;i++){ p +=A[i][i]; } Ein> Set x = {3};
Size of x is List(3)
Ein> b{i} = A{i,j}*x{j};
for (i=0;i<3+0;i++){
for (j=0;j<3+0;j++){
b[i] +=A[i][j]*x[j];
}
}
 
Ein> a{i} = b{i} + x{i};
for (i=0;i<3+0;i++){
a[i]+=b[i];
a[i]+=x[i];
}
 
Ein> d{i} = p*a{i} + A{i,k}*b{k};
for (i=0;i<3+0;i++){
d[i]+=p*a[i];
for (k=0;k<3+0;k++){
d[i] +=A[i][k]*b[k];
}
}

vg, vi, and vj are predefined matrices and arrays, that I have not explicitly printed. vg is a matrix where each element maps to it’s numbering in Voigt notation. While there are several different arbitrary ways of numbering elements using Voigt notation, I will use the following for a 3×3 matrix, T:

  • the diagonal is numbered 1,2,3
  • T(1,2) = 4, T(1,3) = 5, T(2,3) = 6, and corresponding symmetrical elements are also thusly defined.

vi and vj simply correspond to the mapping from Voigt index representation to the (i,j) representation of a matrix. (Rows, Columns).

  • vi = 1,2,3,1,1,2
  • vj = 1,2,3,2,3,3

The following examples employ Voigt optimizations:

Ein> Set F{} = {3,3};
Size of F is List(3, 3)
Ein> C{[i,k]} = F{j,i}*F{j,k};
for (i=0;i<6+0;i++){
for (j=0;j<3+0;j++){
C[i] +=F[j][vi[i]]*F[j][vj[i]];
}
}
 
Ein> CC = C{i,j}*C{[i,j]};
for (i=0;i<3+0;i++){
for (j=0;j<3+0;j++){
CC +=C[vg[i][j]]*C[vg[i][j]];
}
}
 
Ein> Csquared{[i,k]}=C{i,j}*C{j,k};
for (i=0;i<6+0;i++){
for (j=0;j<3+0;j++){
Csquared[i] +=C[vg[vi[i]][j]]*C[vg[j][vj[i]]];
}
}
 
Ein> Set kron = {[3,3]};
Size of kron is List(3, 3)
Ein> E{[i,j]}=0.5*(C{ij}-kron{ij});
for (i=0;i<6+0;i++){
E[i]+=0.500000*C[i];
E[i]-=0.500000*kron[i];
}
 
Ein> env;
Tensor	DimSize	ActualSize
*******************
CC	List(1)	List(1)
d	List(3)	List(3)
a	List(3)	List(3)
A	List(3, 3)	List(3, 3)
F	List(3, 3)	List(3, 3)
kron	List(3, 3)	List(6)
C	List(3, 3)	List(6)
p	List(1)	List(1)
b	List(3)	List(3)
Csquared	List(3, 3)	List(6)
x	List(3)	List(3)
E	List(3, 3)	List(6)

To test this yourself, I have included a compiled version of Ein and Kiama. The following line can be used to run the Ein Interpreter:

scala -classpath kiama_2.8.0-1.0.0.jar:ein110111.jar net.unoc.hawflakes.ein.Interpreter

Feedback

Feedback regarding bugs in index notation implementation, feature requests, and general constructive comments/questions are welcome. Code isn’t shown, since the code is a little messy and not as good as I ideally would like it. Please leave them in the moderate comments below.