From b98e1901f5d7035cd121b7c71b31a78afa89afc3 Mon Sep 17 00:00:00 2001
From: Francis Russell
Date: Wed, 18 Apr 2012 17:20:26 +0100
Subject: [PATCH] Presented version.

---
 presentation.tex | 63 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 61 insertions(+), 2 deletions(-)

diff --git a/presentation.tex b/presentation.tex
index 0c4e249..a5ce7ba 100644
--- a/presentation.tex
+++ b/presentation.tex
@@ -11,10 +11,10 @@
 %\input{pygments.sty}
 
 \title[ONETEP - DSLs and GPUs]{ONETEP - Code generation from a Quantum Chemistry
-DSL / GPU Parallelisation}
+DSL}
 \subtitle{PSL Presentation}
 
-\author[K. Wilkinson \& F. Russell]{Karl Wilkinson \& Francis Russell \\ Joint work with Chris-Kriton Skylaris \& Paul Kelly}
+\author[F. Russell]{Francis Russell \\ Joint work with Chris-Kriton Skylaris \& Karl Wilkinson}
 \date{18/04/2012}
 \institute[ICL \& Soton]{Imperial College London \& Southampton University}
 
@@ -167,6 +167,33 @@ output is FortranFunction("integrals_kinetic", ["kinet",
 
 \frame{
 
+\frametitle{Mapping indices to real code}
+
+\begin{itemize}
+
+\item We don't expose position indices in our DSL since these are
+related to the choice of discretisation. However we use them internally
+for ONETEP code generation.
+
+\item When we use DSL indices to index multiple operands,
+we are forcing the compiler to generate code that can iterate over all
+the operands simultaneously.
+
+\item The mapping between DSL indices and loop-indices in generated code
+is complex. Loops in the generated code may correspond to zero or more DSL
+indices.
+
+\item The code we want to generate is primarily related to control flow and data
+movement. The most computationally intensive part of the kinetic energy
+calculation is the Fourier transform, and we intend to call a black-box function
+for that anyway.
+
+\end{itemize}
+
+}
+
+\frame{
+
 \frametitle{Implementation Challenges: Interfacing with ONETEP}
 
 We do not have a well-defined abstract interface to interact with ONETEP. We
@@ -259,10 +286,42 @@ manipulation of spatial indices, support reduction operations etc.
 
 \frame{
 
+\frametitle{Current Status}
+
+\begin{itemize}
+
+\item Our compiler can parse expressions in our DSL.
+
+\item Using a relatively abstract description of the ONETEP data
+structures, it constructs representations of the iterations and
+predicates required to generate the points required from a particular
+operand.
+
+\item Inferring an efficient loop structure from this when indices are
+bound together is a work in progress.
+
+\item This approach captures data dependency well, but not anything
+involving mutable state. This is needed for allocating and populating
+the array passed to the black-box Fourier transform implementation.
+
+It will also be necessary for correctly reasoning about the
+communication required for MPI code generation.
+
+\end{itemize}
+
+}
+
+\frame{
+
 \frametitle{The Future}
 
 \begin{itemize}
 
+\item{\bf Inferring data movement.} ONETEP already has MPI
+parallelisation. We would like to be able to automatically reason about
+data transfers, exploring the possibility of improving scalability.
+Similarly, this is required if we wish to generate code targeting GPUs.
+
 \item{\bf A mechanism for systematically exploring different code generation
 variants.} If we can generate a search space of valid implementations, we can
 work on defining an objective function that lets us choose the best
-- 
2.47.3