%\input{pygments.sty}
\title[ONETEP - DSLs and GPUs]{ONETEP - Code generation from a Quantum Chemistry
-DSL / GPU Parallelisation}
+DSL}
\subtitle{PSL Presentation}
-\author[K. Wilkinson \& F. Russell]{Karl Wilkinson \& Francis Russell \\ Joint work with Chris-Kriton Skylaris \& Paul Kelly}
+\author[F. Russell]{Francis Russell \\ Joint work with Chris-Kriton Skylaris \& Karl Wilkinson}
\date{18/04/2012}
\institute[ICL \& Soton]{Imperial College London \& Southampton University}
\frame{
+\frametitle{Mapping indices to real code}
+
+\begin{itemize}
+
+\item We don't expose position indices in our DSL since these are
+related to the choice of discretisation. However, we use them internally
+for ONETEP code generation.
+
+\item When we use DSL indices to index multiple operands,
+we are forcing the compiler to generate code that can iterate over all
+the operands simultaneously.
+
+\item The mapping between DSL indices and loop indices in generated code
+is complex. Loops in the generated code may correspond to zero or more DSL
+indices.
+
+\item The code we want to generate is primarily related to control flow and data
+movement. The most computationally intensive part of the kinetic energy
+calculation is the Fourier transform, and we intend to call a black-box function
+for that anyway.
+
+\end{itemize}
+
+}
+
+\frame{
+
\frametitle{Implementation Challenges: Interfacing with ONETEP}
We do not have a well-defined abstract interface to interact with ONETEP. We
\frame{
+\frametitle{Current Status}
+
+\begin{itemize}
+
+\item Our compiler can parse expressions in our DSL.
+
+\item Using a relatively abstract description of the ONETEP data
+structures, it constructs representations of the iterations and
+predicates required to generate the points needed from a particular
+operand.
+
+\item Inferring an efficient loop structure from this when indices are
+bound together is a work in progress.
+
+\item This approach captures data dependency well, but not anything
+involving mutable state. Reasoning about mutable state is needed for
+allocating and populating the array passed to the black-box Fourier
+transform implementation.
+
+It will also be necessary for correctly reasoning about the
+communication required for MPI code generation.
+
+\end{itemize}
+
+}
+
+\frame{
+
\frametitle{The Future}
\begin{itemize}
+\item{\bf Inferring data movement.} ONETEP already has MPI
+parallelisation. We would like to be able to automatically reason about
+data transfers, exploring the possibility of improving scalability.
+Similarly, this is required if we wish to generate code targeting GPUs.
+
\item{\bf A mechanism for systematically exploring different code generation
variants.} If we can generate a search space of valid implementations, we
can work on defining an objective function that lets us choose the best