From: Francis Russell
Date: Wed, 28 Nov 2012 17:51:52 +0000 (+0000)
Subject: Remove obsolete notes.
X-Git-Url: https://git.unchartedbackwaters.co.uk/w/?a=commitdiff_plain;h=33b5bd40dc5b0c20ab44222e9e2c914595e638e6;p=francis%2Fofc.git

Remove obsolete notes.
---

diff --git a/docs/allocation.txt b/docs/allocation.txt
deleted file mode 100644
index a0bd3f6..0000000
--- a/docs/allocation.txt
+++ /dev/null
@@ -1,37 +0,0 @@
-Problem: we need to be able to persist the output of operands to
-temporary buffers.
-
-
-Case examples
--------------
-
-- Matrix (treated as dense):
-1. Both indices have well defined ranges from 1 to n.
-2. Size of indices do not depend on each other.
-
-- Matrix (treated as sparse):
-1. Has a block sparse format.
-2. Depends on whether we know how to persist each block.
-3. Need to store offsets for each block.
-
-- PPD function set:
-1. PPD positions are dependent on all indices up to the ppd index.
-2. Tight-box positions are only dependent on function index.
-3. Data doesn't have especially well-defined ranges.
-4. Max range is bound of simulation cell (but what if multiple
-simulation cells?).
-5. We need explicit bounding to construct a FFT-box.
-
-
-Storage in a dense array
-------------------------
-
-1. If we know an index, we don't need to persist storage across it.
-2. When we persist storage, we can use known indices to restrict the
-size.
-
-Abstractions breakdown
-----------------------
-
-1. Spatial indices are typically independent since we cover rectangular
-regions.
diff --git a/docs/code_generation_strategies.txt b/docs/code_generation_strategies.txt
deleted file mode 100644
index b1d891f..0000000
--- a/docs/code_generation_strategies.txt
+++ /dev/null
@@ -1,116 +0,0 @@
-Some notes to help clarify things
----------------------------------
-
-1. Each node in the DSL's expression tree corresponds to a scalar value
-with 0 or more indices.
-
-2. Indices have dependencies on other indices. This will affect the loop
-nesting order.
-
-3. Nodes have dependencies on other nodes. This will affect where in
-loops certain expressions are computed.
-
-4. Nodes that destroy indices imply that information will be passed
-outside a loop. This will typically require allocation of storage.
-
-5. Nodes will currently assume that they get random access to the
-iteration space described by the indices they destroy.
-
-6. Nodes will have a prefix call that is used to set-up any data
-structure they need and iterate over their operands
-
-7. Nodes will have a call that allows access to their data, using their
-already defined indices.
-
-8. Nodes will have a postfix call that performs cleanup of any allocated
-data structures.
-
-Code Generation Strategy
-------------------------
-
-1. Topological traversal of expression tree.
-
-2. Only when nodes destroy an index is the loop for that expression
-created.
-
-3. DataSpaces (leaf nodes) are traversed in their native format.
-
-4. All other iteration spaces will be assumed to be dense, and will
-dynamically allocated memory in the dense format.
-
-5. We would expect that it would be possible to implement an inner
-product as requiring O(1) space if the loops are traversed in the
-correct order. However this requires that the indices that are not being
-reduced over are dense.
-
-In the case of removing a PPD-index, the x, y & z indices are derived
-and also dependent upon the specific PPD.
-
-The correct strategy would appear to be to destroy and re-create any
-indices that are derived (depend?) on the indices being eliminated.
-
-Code Generation Strategy 2
---------------------------
-
-1. Determine indices for all nodes in tree.
-
-2. Determine which indices are created and destroyed by each node.
-
-3. By default, we assume that all nodes assign their results to arrays,
-which then permit random access by the next operand.
-
-4. Nodes which share common prefixes can be moved inside the same loops.
-
-5. Nodes which share common prefixes can have their array temporary
-extends reduced.
-
-6. Nodes have two sets of indices
- - Indices representing the size of data they produce.
- - Indices representing the loops required to iterate over them.
-
-7. FFT nodes entirely destroy their spatial indices.
-
-8. Nodes access data by providing index bindings for all indices they destroy.
-
-9. Nodes set data by providing index bindings for all indices they create.
-
-Case Study
-----------
-
-An inner product between an entirely dense 3D data-set and one stored in PPD format.
-
-1. Let's say that an inner product doesn't entirely destroy its indices.
-
-2. Instead, the indices it produces are a composite of its input indices.
-
-3. Now both inputs generate indices that are not destroyed (sort of).
-
-4. We can now fuse provided that we can generate loops for the fused
-iteration.
-
-4. However, we only fuse on common prefixes, but the Function Set
-has the PPD-iteration loop as its outermost loop.
-
-5. We chose the common-prefix law in order to avoid redundant
-computation of values, but the PPD loop doesn't do this. The subsets the
-PPD loop iterates over are disjoint.
-
-6. Both block matrices and the PPD-representation have these external
-sorts of indices. Both iterate over disjoint subsets of the data. It
-would be nice to have more examples to fully characterise them.
-
-7. So when are we allowed to move the PPD iteration loop? Why is this
-such an issue? If in the integrals_kinetic example, we were able to move
-the PPD iteration loop outside our assignment loop (fixed # of PPDs per
-sphere), it would destroy everything. Clearly, it must be constrained
-somehow.
-
-8. The work by Kotlyar et al. focusses on how to construct the set of
-tuples required for iterations, but now how they are to be reduced. The
-PPD-loop cannot be moved because the reductions are non-trivial for a
-number of the quantum chemistry operators (e.g. FFT).
-
-9. All loop re-arrangements must respect two restrictions:
- - Loops cannot move outside the loops they depend on (the obvious one).
- - Loops cannot move above a node that destroys them. Why?
- It represents some form of implicit reduction over that index?
diff --git a/docs/combining_producers_consumers.txt b/docs/combining_producers_consumers.txt
deleted file mode 100644
index 32e8c84..0000000
--- a/docs/combining_producers_consumers.txt
+++ /dev/null
@@ -1,51 +0,0 @@
-How do we combine producers and consumers?
-------------------------------------------
-
-Assume some producer x generating points:
-
-Loop A
-  Loop B
-    x(A, B, v)
-
-
-Case 1: Consumer y operates point-wise:
-
-Loop A
-  Loop B
-    y(A, B, v)
-
-
-Case 2: Consumer y operates block-wise:
-
-Loop A
-  Loop B
-    y_in(A, B, v)
-
-y_opaque
-
-Loop C
-  Loop D
-    y_out(C, D, v)
-
-
-In both cases, we can always substitute the point receiver of y into the
-body of loop B. However, depending on whether y is point-wise or
-block-wise, we may have intermediate code and the output expression may
-be either in the original loops, or new generated loops. How do we
-generalise the templating scheme for either strategy?
-
-We cannot just substitute y into x since code in y may exist outside of
-y's loops (the block-wise case). We cannot substitute x into y since the
-body of y may also need to exist within x's loops (the point-wise case).
-
-We have two types of granularity when considering producers and
-consumers:
-
-1. The type of object on which an operator conceptually operates.
-2. The number of elements each operator needs before it can operate.
-
-An fftbox operator operators conceptually on blocks, but can be applied
-on a point-wise basis since it merely filters point.
-
-A reciprocal operator (the FFT) requires all points in a spatial block
-before it can operate.
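Note on the removed combining_producers_consumers.txt above: the two consumer cases can be sketched in Python as follows. This is only an illustration under stated assumptions, not code from the repository. The names `produce`, `run_pointwise` and `run_blockwise` are hypothetical, and the mean subtraction merely stands in for an opaque step (such as an FFT) that needs every point of the block before it can run.

```python
def produce(a, b):
    """Hypothetical producer x(A, B, v): one value per loop point."""
    return float(10 * a + b)


def run_pointwise(n_a, n_b, scale=-0.5):
    """Case 1: the consumer body is substituted into the body of loop B."""
    out = {}
    for a in range(n_a):          # Loop A
        for b in range(n_b):      # Loop B
            out[(a, b)] = produce(a, b) * scale   # y(A, B, v)
    return out


def run_blockwise(n_a, n_b):
    """Case 2: buffer all points, run an opaque step, then read back."""
    buf = [[0.0] * n_b for _ in range(n_a)]
    for a in range(n_a):          # Loop A
        for b in range(n_b):      # Loop B
            buf[a][b] = produce(a, b)             # y_in(A, B, v)

    # y_opaque: assumed stand-in for a black box. It lives outside the
    # producer's loops because it needs the whole block to be present.
    mean = sum(map(sum, buf)) / (n_a * n_b)

    out = {}
    for c in range(n_a):          # Loop C
        for d in range(n_b):      # Loop D
            out[(c, d)] = buf[c][d] - mean        # y_out(C, D, v)
    return out
```

In the point-wise case no temporary is needed, whereas the block-wise case forces a buffer plus a second loop nest, which is the asymmetry the note is concerned with.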
diff --git a/docs/expression_strategy.txt b/docs/expression_strategy.txt
deleted file mode 100644
index e5baae1..0000000
--- a/docs/expression_strategy.txt
+++ /dev/null
@@ -1,99 +0,0 @@
-Sources provide :
- - An access expression
- - Expressions for computing offsets (i.e. in physical space)
- - Quantifiers across loop index
- - Predicates on the ranges of their operands
- - Predicates on the sizes of their operands
-
-The implementation of an inner product:
- - We need an intermediate buffer.
- - Must be unique in all enclosing indices.
- - Must be unique in all indices except one being reduced.
- - Output is the iteration over the resulting buffer.
-
-The instantiation of intermediate buffers:
- - There is a two-way dependence between buffers and code making this tricky.
- - Can we generate code with no-buffers?
- - Yes, we can always recompute, except for cases where buffers are explicit.
- - Can we generate code with all buffers?
- - Yes, slow as a dog and horribly memory inefficient.
- - What are buffers anyway?
- - Explicit names for values.
- - Helpful for reductions such as inner products.
- - Helpful for black box handling of FFTs.
- - Buffers can provide an iteration context.
- - When do we want to instantiate buffers?
- - Anywhere they are instantiated explicitly.
- - Whenever an expression could be subject to loop-invariant code motion.
-
-Code generation strategy:
- - Input is set of predicates and quantified expressions
- - FFT is equivalent to IO in a monadic context?
- - We need to be able to explicitly represent buffers in our input thanks to black boxes.
- - Expressions have an associated set of indices required to uniquely identify their value.
- - Assignment results in fictional expression?
- - What is the dimensionality of the FFT expression?
- - Do we treat the FFT-buffer as special buffer sized value?
- - Better to try to pretend it's a value with special independent dense-ranged iteration indices?
- - IO is the assignment operation.
- - Assignment is the guarantee a certain set of operations *will* be executed.
- - Do we need to handle reduction explicitly in our tree?
-
-Our database:
- - Some form of quantifier (how close to a loop body this is, is uncertain). At least an index.
- - Predicates on the range of derived expressions.
- - Predicates on the lower and upper bounds of loop indices.
- - How do we distinguish predicates which are given from those that must be satisfied?
-
-Code generation:
- - Input a set of quantified expressions.
- - Try to move redundant expressions outward.
- - Move overly quantified expressions to temporary variables.
- - Fused loop generation:
- - We need to explicitly handle the different cases.
- - We look for expressions located in contexts in which we have a loop variable binding.
- - In the dense case, we can explicitly index the loop since our transformation is invertible.
-
-Handling assignments:
- - The integrals_kinetic assignment is tricky since we only want to handle values in the sparsity pattern.
- - Correct iteration over non-zero values is preferable.
-
-Predicates:
- LowerBound(index)
- UpperBound(index)
- IsKnown(conditional)
- Require(conditional)
- where conditionals are (expression {<,==,>} expression)
-
-Expressions:
- Expressions are integer or floating point
- IntegerExpressions are {Constant, a+b, a-b, a*b, IntegerArrayAccess[IntegerExpression]}
- FloatingPointExpressions are {Constant, a+b, a-b, a*b, a/b, FloatingPointArrayAccess[IntegerExpression]}
-
-Dimensionality of Expressions:
- All variables on the rhs of an expression determine a unique instance in which it is valid.
-
-Reduction operations:
- Disjoint reduction, we remove an index.
- Summation reduction, we sum over an index.
-
- In disjoint reduction, we are removing an actual index.
- In the summation reduction, we want to reduce over a derived index.
-
-Buffer allocation:
- Buffers in our tree define explicit rhs indices.
- We try to define explicit bounds and widths for the rhs indices.
-
- Reductions explictly name stored values.
- We look for IsKnowns conditions on size of the buffer.
-
- For any buffer within some synthesized loop set
- We need to determine the correct size.
- Only iterative over approprate ranges.
- Store the invertable index mapping.
-
-
-
-
-
-
diff --git a/docs/implementation_issues.tex b/docs/implementation_issues.tex
deleted file mode 100644
index e175baa..0000000
--- a/docs/implementation_issues.tex
+++ /dev/null
@@ -1,95 +0,0 @@
-% vim:tw=80
-\documentclass[a4paper, twoside, pdflatex]{article}
-\usepackage[utf8x]{inputenc}
-
-\title{Observations on Compiling a Quantum Chemistry DSL to a ONETEP Fortran
-Implementation}
-\author{Francis Russell}
-\date{}
-
-\begin{document}
-\maketitle
-
-\section{The Task}
-
-Compile a domain specific language describing a quantity computed in the
-quantum chemistry code ONETEP to a Fortran implementation capable of interacting
-with ONTEP.
-
-Some points to note:
-
-\begin{itemize}
-
-\item We posses hand-optimised implementations of the code corresponding to DSL
-expressions we wish to be able to compile in ONETEP already.
-
-\item The code generated needs to interact with ONETEP data structures either
-directly, or via ONETEP supplied methods. The interface to these operands are
-highly performance-oriented and are intended to satisfy a number of requirements
-specific to ONETEP. However, we do not wish to over-specialise the code
-generator to these data structures in case they change, or we decide to target
-other applications other than ONETEP.
-
-\end{itemize}
-
-\section{The Domain Specific Language}
-
-As an example, we consider a fragment of our DSL that specifies the computation
-of a matrix as performed the {\tt integrals\_kinetic} function on ONETEP.
-
-\begin{verbatim}
-kinet[alpha, beta] =
- inner(bra[alpha], reciprocal(laplacian(reciprocal(fftbox(ket[beta])))*-0.5))
-\end{verbatim}
-
-Here, {\tt alpha} and {\tt beta} represent indices that are correspond to a
-quantification of the indices over all valid values. Indices are defined to have
-the same value at all points of use in the expression.
-
-In the above expression, the indices have been used to specify the rows and
-columns of matrix {\tt kinet} and specific functions in the function sets {\tt
-bra} and {\tt ket}.
-
-We have currently chosen not to expose the indices that correspond to the
-spatial co-ordinates of the basis function data itself.
-
-\subsection{Validation}
-
-Given that indices can be bound in the expressions we can write in our DSL
-arbitrarily, it is unclear whether or not it is possible to construct
-expressions which are invalid due to a particular choice of index bindings. In
-this case we take \emph{invalid} to mean that it is impossible to evaluate an
-assignment. Ignoring mathematical concerns, we require that for an expression to
-be valid:
-
-\begin{itemize}
-
-\item It it should be possible to synthesize a loop nest that
-evaluates the specified assignment.
-
-\item The interpretation of value of the LHS of the assignment is unique.
-
-\end{itemize}
-
-We consider the following expression:
-
-\begin{verbatim}
-a[i,j] = c[i][k] * d[j][k]
-\end{verbatim}
-
-We consider this malformed since dimension {\tt k} exists on the RHS but not on
-the LHS. We do not adopt \emph{Einstein notation} in that a repeated index
-implies a summation over that index.
-
-\section{Implementation Issues}
-
-In this section I discuss the various issues encountered in the initial creation
-of the code generator.
-
-\section{The Naïve Implementation}
-
-The most naïve implementation is to attempt to evaluate the RHS of the
-assignment using the same strategy one would use for a simple scalar expression
-tree.
-
-\end{document}
diff --git a/docs/operator_semantics.txt b/docs/operator_semantics.txt
deleted file mode 100644
index b02f74d..0000000
--- a/docs/operator_semantics.txt
+++ /dev/null
@@ -1,30 +0,0 @@
-- fftbox
-
-Conceptually, this is a block-wise operator. However, it is just a
-spatial restriction that can be applied on a per-point basis, but
-requires point co-ordinate information to do so.
-
-- *
-
-The scaling operator is point-wise and does not require access to any
-index values.
-
-- reciprocal
-
-This operator is block-wise. Since we intend to implement is as a black
-box, we need to incorporate reading into and out of a buffer passed to
-a FFT implementation.
-
-This operator will define new output indices.
-
-- laplacian
-
-Like the fftbox operator, this is conceptually block-wise but can be
-applied on a per-point basis.
-
-- inner
-
-Operates on blocks, can be applied point-wise. Is the only operator we
-currently have that binds spatial indices together. Typically will
-require one of the operands to support random access. Which one should
-be a choice for the compiler.
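Note on the removed `inner` entry above: the remark that one operand typically needs random access can be illustrated with a small Python sketch. It is not taken from the repository and rests on assumptions: each function is represented as a dict from a spatial point `(x, y, z)` to its value, the name `inner` is reused only for readability, and streaming the operand with fewer stored points is one possible policy for the choice the notes leave to the compiler.

```python
def inner(bra, ket):
    """Sum of products over the spatial points present in both operands."""
    # Assumed policy: stream the operand with fewer stored points and
    # use the other one for random access by point.
    streamed, lookup = (bra, ket) if len(bra) <= len(ket) else (ket, bra)
    total = 0.0
    for point, value in streamed.items():
        # Points missing from the random-access operand contribute zero.
        total += value * lookup.get(point, 0.0)
    return total


# Hypothetical sample data: only the point (1, 0, 0) is shared.
bra = {(0, 0, 0): 1.0, (1, 0, 0): 2.0}
ket = {(1, 0, 0): 3.0, (2, 0, 0): 4.0, (3, 0, 0): 5.0}
overlap = inner(bra, ket)
```

The streamed operand is visited point by point in its native order, so only the other operand has to support lookup by spatial index.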