Remove obsolete notes.

author Francis Russell <francis@unchartedbackwaters.co.uk>

Wed, 28 Nov 2012 17:51:52 +0000 (17:51 +0000)

committer Francis Russell <francis@unchartedbackwaters.co.uk>

Thu, 29 Nov 2012 14:13:06 +0000 (14:13 +0000)
diff --git a/docs/allocation.txt b/docs/allocation.txt
deleted file mode 100644
index a0bd3f6..0000000
--- a/docs/allocation.txt
+++ /dev/null
@@ -1,37 +0,0 @@
-Problem: we need to be able to persist the output of operands to
-temporary buffers.
-
-
-Case examples
--------------
-
-- Matrix (treated as dense):
-1. Both indices have well defined ranges from 1 to n.
-2. The sizes of the indices do not depend on each other.
-
-- Matrix (treated as sparse):
-1. Has a block sparse format.
-2. Depends on whether we know how to persist each block.
-3. Need to store offsets for each block.
-
-- PPD function set:
-1. PPD positions are dependent on all indices up to the ppd index.
-2. Tight-box positions are only dependent on function index.
-3. Data doesn't have especially well-defined ranges.
-4. Max range is bound of simulation cell (but what if multiple
-simulation cells?).
-5. We need explicit bounding to construct a FFT-box.
-
-
-Storage in a dense array
-------------------------
-
-1. If we know an index, we don't need to persist storage across it.
-2. When we persist storage, we can use known indices to restrict the
-size.
-
-Abstractions breakdown
-----------------------
-
-1. Spatial indices are typically independent since we cover rectangular
-regions.
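The dense-storage notes in docs/allocation.txt can be illustrated with a minimal Python sketch. It assumes every index has a fixed, known extent, which the notes say does not hold for PPD data without explicit bounding. The names `temporary_shape`, `temporary_size` and `extents` are hypothetical and do not come from the deleted file or the code generator.

```python
from math import prod


def temporary_shape(extents, known):
    """Return the dimensions a dense temporary must store.

    extents: dict mapping index name -> number of values it ranges over.
    known:   set of index names already bound by enclosing loops.
    """
    # An index fixed by an enclosing loop needs no storage dimension.
    return {name: size for name, size in extents.items() if name not in known}


def temporary_size(extents, known):
    """Number of elements in the dense temporary (1 if every index is known)."""
    return prod(temporary_shape(extents, known).values())


# Hypothetical extents: four functions on an 8x8x8 box.
extents = {"alpha": 4, "x": 8, "y": 8, "z": 8}
full = temporary_size(extents, known=set())
restricted = temporary_size(extents, known={"alpha"})
```

Binding `alpha` in an enclosing loop removes that dimension from the buffer, which is the restriction described in point 2 of "Storage in a dense array".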
diff --git a/docs/code_generation_strategies.txt b/docs/code_generation_strategies.txt
deleted file mode 100644
index b1d891f..0000000
--- a/docs/code_generation_strategies.txt
+++ /dev/null
@@ -1,116 +0,0 @@
-Some notes to help clarify things
----------------------------------
-
-1. Each node in the DSL's expression tree corresponds to a scalar value
-with 0 or more indices.
-
-2. Indices have dependencies on other indices. This will affect the loop
-nesting order.
-
-3. Nodes have dependencies on other nodes. This will affect where in
-loops certain expressions are computed.
-
-4. Nodes that destroy indices imply that information will be passed
-outside a loop. This will typically require allocation of storage.
-
-5. Nodes will currently assume that they get random access to the
-iteration space described by the indices they destroy.
-
-6. Nodes will have a prefix call that is used to set up any data
-structures they need and iterate over their operands.
-
-7. Nodes will have a call that allows access to their data, using their
-already defined indices.
-
-8. Nodes will have a postfix call that performs cleanup of any allocated
-data structures.
-
-Code Generation Strategy
-------------------------
-
-1. Topological traversal of expression tree.
-
-2. Only when nodes destroy an index is the loop for that expression
-created.
-
-3. DataSpaces (leaf nodes) are traversed in their native format.
-
-4. All other iteration spaces will be assumed to be dense, and will use
-dynamically allocated memory in the dense format.
-
-5. We would expect that it would be possible to implement an inner
-product as requiring O(1) space if the loops are traversed in the
-correct order. However this requires that the indices that are not being
-reduced over are dense.
-
-In the case of removing a PPD-index, the x, y & z indices are derived
-and also dependent upon the specific PPD.
-
-The correct strategy would appear to be to destroy and re-create any
-indices that are derived from (depend on?) the indices being eliminated.
-
-Code Generation Strategy 2
---------------------------
-
-1. Determine indices for all nodes in tree.
-
-2. Determine which indices are created and destroyed by each node.
-
-3. By default, we assume that all nodes assign their results to arrays,
-which then permit random access by the next operand.
-
-4. Nodes which share common prefixes can be moved inside the same loops.
-
-5. Nodes which share common prefixes can have their array temporary
-extents reduced.
-
-6. Nodes have two sets of indices
-   - Indices representing the size of data they produce.
-   - Indices representing the loops required to iterate over them.
-
-7. FFT nodes entirely destroy their spatial indices.
-
-8. Nodes access data by providing index bindings for all indices they destroy.
-
-9. Nodes set data by providing index bindings for all indices they create.
-
-Case Study
-----------
-
-An inner product between an entirely dense 3D data-set and one stored in PPD format.
-
-1. Let's say that an inner product doesn't entirely destroy its indices.
-
-2. Instead, the indices it produces are a composite of its input indices.
-
-3. Now both inputs generate indices that are not destroyed (sort of).
-
-4. We can now fuse provided that we can generate loops for the fused
-iteration.
-
-5. However, we only fuse on common prefixes, and the Function Set
-has the PPD-iteration loop as its outermost loop.
-
-6. We chose the common-prefix law in order to avoid redundant
-computation of values, but the PPD loop doesn't do this. The subsets the
-PPD loop iterates over are disjoint.
-
-7. Both block matrices and the PPD-representation have these external
-sorts of indices. Both iterate over disjoint subsets of the data. It
-would be nice to have more examples to fully characterise them.
-
-8. So when are we allowed to move the PPD iteration loop? Why is this
-such an issue? If in the integrals_kinetic example, we were able to move
-the PPD iteration loop outside our assignment loop (fixed # of PPDs per
-sphere), it would destroy everything. Clearly, it must be constrained
-somehow.
-
-9. The work by Kotlyar et al. focusses on how to construct the set of
-tuples required for iterations, but not how they are to be reduced. The
-PPD-loop cannot be moved because the reductions are non-trivial for a
-number of the quantum chemistry operators (e.g. FFT).
-
-10. All loop re-arrangements must respect two restrictions:
-   - Loops cannot move outside the loops they depend on (the obvious one).
-   - Loops cannot move above a node that destroys them. Why?
-     It represents some form of implicit reduction over that index?
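The first of the two loop re-arrangement restrictions at the end of docs/code_generation_strategies.txt (a loop cannot move outside the loops it depends on) can be sketched as a check on a candidate nesting order. This is a minimal Python illustration, not the generator's actual algorithm. The dependency table is an assumption based on the notes that PPD positions depend on the function index and that the x, y and z indices depend on the specific PPD. The second restriction, on nodes that destroy an index, is not modelled.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each index maps to the indices it depends on.
depends_on = {
    "func": set(),
    "ppd": {"func"},
    "x": {"ppd"},
    "y": {"ppd"},
    "z": {"ppd"},
}


def respects_dependencies(order, depends_on):
    """True if every loop is nested inside all the loops it depends on.

    order lists loop indices from outermost to innermost and must contain
    every index named in depends_on.
    """
    depth = {index: level for level, index in enumerate(order)}
    return all(
        depth[dep] < depth[index]
        for index in order
        for dep in depends_on.get(index, ())
    )


# One valid nesting order: dependencies always come before their dependants.
valid_order = list(TopologicalSorter(depends_on).static_order())

# Hoisting the PPD loop above the function loop breaks the restriction.
hoisted_ok = respects_dependencies(["ppd", "func", "x", "y", "z"], depends_on)
```

Any order produced by the topological sort passes the check, while `hoisted_ok` is false because the PPD loop has been moved outside the loop over functions.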
diff --git a/docs/combining_producers_consumers.txt b/docs/combining_producers_consumers.txt
deleted file mode 100644
index 32e8c84..0000000
--- a/docs/combining_producers_consumers.txt
+++ /dev/null
@@ -1,51 +0,0 @@
-How do we combine producers and consumers?
-------------------------------------------
-
-Assume some producer x generating points:
-
-Loop A
-  Loop B 
-    x(A, B, v)
-
-
-Case 1: Consumer y operates point-wise:
-
-Loop A
-  Loop B 
-    y(A, B, v)
-
-
-Case 2: Consumer y operates block-wise:
-
-Loop A
-  Loop B
-    y_in(A, B, v)
-
-y_opaque
-
-Loop C
-  Loop D
-    y_out(C, D, v)
-
-
-In both cases, we can always substitute the point receiver of y into the
-body of loop B. However, depending on whether y is point-wise or
-block-wise, we may have intermediate code and the output expression may
-be either in the original loops, or new generated loops. How do we
-generalise the templating scheme for either strategy?
-
-We cannot just substitute y into x since code in y may exist outside of
-y's loops (the block-wise case). We cannot substitute x into y since the
-body of y may also need to exist within x's loops (the point-wise case).
-
-We have two types of granularity when considering producers and
-consumers:
-
-1. The type of object on which an operator conceptually operates.
-2. The number of elements each operator needs before it can operate.
-
-An fftbox operator operates conceptually on blocks, but can be applied
-on a point-wise basis since it merely filters points.
-
-A reciprocal operator (the FFT) requires all points in a spatial block
-before it can operate.
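The two cases in docs/combining_producers_consumers.txt can be sketched with Python generators. This is an illustration only: the generated code targets Fortran, and the function names `producer`, `scale` and `reverse_block` are hypothetical. The block-wise consumer reverses the values as a stand-in for an opaque operator such as the FFT, because like the FFT it cannot emit anything until it has seen the whole block.

```python
def producer(n_a, n_b):
    """Producer x: yields one point (A, B, v) per inner-loop iteration."""
    for a in range(n_a):
        for b in range(n_b):
            yield a, b, float(a * n_b + b)


def scale(points, factor):
    """Case 1, point-wise consumer: its body runs inside the producer's loops."""
    for a, b, v in points:
        yield a, b, v * factor


def reverse_block(points):
    """Case 2, block-wise consumer: needs every point before it can operate."""
    buffer = list(points)                       # y_in: gather all points
    values = [v for _, _, v in buffer][::-1]    # y_opaque: whole-block step
    for (a, b, _), v in zip(buffer, values):    # y_out: new loops over result
        yield a, b, v


pointwise = list(scale(producer(2, 2), -0.5))
blockwise = list(reverse_block(producer(2, 2)))
```

In the point-wise case the consumer's output appears inside the original loops, while in the block-wise case the intermediate buffer separates the producer's loops from the newly generated output loops.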
diff --git a/docs/expression_strategy.txt b/docs/expression_strategy.txt
deleted file mode 100644
index e5baae1..0000000
--- a/docs/expression_strategy.txt
+++ /dev/null
@@ -1,99 +0,0 @@
-Sources provide :
- - An access expression 
- - Expressions for computing offsets (i.e. in physical space)
- - Quantifiers across loop index
- - Predicates on the ranges of their operands
- - Predicates on the sizes of their operands
-
-The implementation of an inner product:
- - We need an intermediate buffer.
- - Must be unique in all enclosing indices.
- - Must be unique in all indices except one being reduced.
- - Output is the iteration over the resulting buffer.
-
-The instantiation of intermediate buffers:
- - There is a two-way dependence between buffers and code making this tricky.
- - Can we generate code with no-buffers?
-   - Yes, we can always recompute, except for cases where buffers are explicit.
- - Can we generate code with all buffers?
-   - Yes, slow as a dog and horribly memory inefficient.
- - What are buffers anyway?
-   - Explicit names for values.
-   - Helpful for reductions such as inner products.
-   - Helpful for black box handling of FFTs.
- - Buffers can provide an iteration context.
- - When do we want to instantiate buffers?
-   - Anywhere they are instantiated explicitly.
-   - Whenever an expression could be subject to loop-invariant code motion.
-
-Code generation strategy:
- - Input is set of predicates and quantified expressions
- - FFT is equivalent to IO in a monadic context?
- - We need to be able to explicitly represent buffers in our input thanks to black boxes.
- - Expressions have an associated set of indices required to uniquely identify their value.
- - Assignment results in fictional expression?
- - What is the dimensionality of the FFT expression? 
- - Do we treat the FFT-buffer as special buffer sized value?
- - Better to try to pretend it's a value with special independent dense-ranged iteration indices?
- - IO is the assignment operation.
- - Assignment is the guarantee a certain set of operations *will* be executed.
- - Do we need to handle reduction explicitly in our tree?
-
-Our database:
- - Some form of quantifier (how close to a loop body this is, is uncertain). At least an index.
- - Predicates on the range of derived expressions.
- - Predicates on the lower and upper bounds of loop indices.
- - How do we distinguish predicates which are given from those that must be satisfied?
-
-Code generation:
- - Input a set of quantified expressions.
- - Try to move redundant expressions outward.
- - Move overly quantified expressions to temporary variables.
- - Fused loop generation: 
-   - We need to explicitly handle the different cases.
-   - We look for expressions located in contexts in which we have a loop variable binding.
-   - In the dense case, we can explicitly index the loop since our transformation is invertible.
-
-Handling assignments:
- - The integrals_kinetic assignment is tricky since we only want to handle values in the sparsity pattern.
- - Correct iteration over non-zero values is preferable.
-
-Predicates:
- LowerBound(index)
- UpperBound(index)
- IsKnown(conditional)
- Require(conditional)
- where conditionals are (expression {<,==,>} expression)
-
-Expressions:
- Expressions are integer or floating point
- IntegerExpressions are {Constant, a+b, a-b, a*b, IntegerArrayAccess[IntegerExpression]}
- FloatingPointExpressions are {Constant, a+b, a-b, a*b, a/b, FloatingPointArrayAccess[IntegerExpression]}
-
-Dimensionality of Expressions:
- All variables on the rhs of an expression determine a unique instance in which it is valid.
-
-Reduction operations:
- Disjoint reduction, we remove an index.
- Summation reduction, we sum over an index.
-
- In disjoint reduction, we are removing an actual index.
- In the summation reduction, we want to reduce over a derived index.
-
-Buffer allocation:
- Buffers in our tree define explicit rhs indices.
- We try to define explicit bounds and widths for the rhs indices.
-
- Reductions explicitly name stored values.
- We look for IsKnown conditions on the size of the buffer.
-
- For any buffer within some synthesized loop set
- We need to determine the correct size.
- Only iterate over appropriate ranges.
- Store the invertible index mapping.
-
-
-
-
-
-
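The expression grammar and the "Dimensionality of Expressions" remark in docs/expression_strategy.txt can be sketched as a small expression tree in Python. All class and function names below are hypothetical. The `Index` node in particular is an assumption, since the notes list constants, arithmetic and array accesses but do not say how a loop index appears inside an expression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constant:
    value: float


@dataclass(frozen=True)
class Index:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str          # one of "+", "-", "*", "/"
    left: object
    right: object


@dataclass(frozen=True)
class ArrayAccess:
    array: str
    subscript: object   # an integer-valued expression


def indices_of(expr):
    """Collect the loop indices needed to uniquely identify an expression's value."""
    if isinstance(expr, Index):
        return {expr.name}
    if isinstance(expr, BinOp):
        return indices_of(expr.left) | indices_of(expr.right)
    if isinstance(expr, ArrayAccess):
        return indices_of(expr.subscript)
    return set()   # constants depend on no index


# data[i * 8 + j] * 2.0
offset = BinOp("+", BinOp("*", Index("i"), Constant(8)), Index("j"))
expr = BinOp("*", ArrayAccess("data", offset), Constant(2.0))
needed = indices_of(expr)
```

An expression whose index set is smaller than the set of enclosing loop indices is a candidate for moving outward, which is one reading of "Try to move redundant expressions outward" in the notes.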
diff --git a/docs/implementation_issues.tex b/docs/implementation_issues.tex
deleted file mode 100644
index e175baa..0000000
--- a/docs/implementation_issues.tex
+++ /dev/null
@@ -1,95 +0,0 @@
-% vim:tw=80
-\documentclass[a4paper, twoside, pdflatex]{article}
-\usepackage[utf8x]{inputenc}
-
-\title{Observations on Compiling a Quantum Chemistry DSL to a ONETEP Fortran
-Implementation}
-\author{Francis Russell}
-\date{}
-
-\begin{document}
-\maketitle
-
-\section{The Task}
-
-Compile a domain specific language describing a quantity computed in the
-quantum chemistry code ONETEP to a Fortran implementation capable of interacting
-with ONETEP.
-
-Some points to note:
-
-\begin{itemize}
-
-\item We already possess hand-optimised implementations in ONETEP of the code
-corresponding to the DSL expressions we wish to be able to compile.
-
-\item The code generated needs to interact with ONETEP data structures either
-directly, or via ONETEP supplied methods. The interfaces to these operands are
-highly performance-oriented and are intended to satisfy a number of requirements
-specific to ONETEP. However, we do not wish to over-specialise the code
-generator to these data structures in case they change, or we decide to target
-applications other than ONETEP.
-
-\end{itemize}
-
-\section{The Domain Specific Language}
-
-As an example, we consider a fragment of our DSL that specifies the computation
-of a matrix as performed by the {\tt integrals\_kinetic} function in ONETEP.
-
-\begin{verbatim}
-kinet[alpha, beta] = 
-  inner(bra[alpha], reciprocal(laplacian(reciprocal(fftbox(ket[beta])))*-0.5))
-\end{verbatim}
-
-Here, {\tt alpha} and {\tt beta} represent indices that correspond to a
-quantification over all valid index values. Indices are defined to have
-the same value at all points of use in the expression.
-
-In the above expression, the indices have been used to specify the rows and
-columns of matrix {\tt kinet} and specific functions in the function sets {\tt
-bra} and {\tt ket}.
-
-We have currently chosen not to expose the indices that correspond to the
-spatial co-ordinates of the basis function data itself.
-
-\subsection{Validation}
-
-Given that indices can be bound in the expressions we can write in our DSL
-arbitrarily, it is unclear whether or not it is possible to construct
-expressions which are invalid due to a particular choice of index bindings. In
-this case we take \emph{invalid} to mean that it is impossible to evaluate an
-assignment. Ignoring mathematical concerns, we require that for an expression to
-be valid:
-
-\begin{itemize}
-
-\item It should be possible to synthesize a loop nest that
-evaluates the specified assignment.
-
-\item The interpretation of the value of the LHS of the assignment is unique.
-
-\end{itemize}
-
-We consider the following expression:
-
-\begin{verbatim}
-a[i,j] = c[i][k] * d[j][k]
-\end{verbatim}
-
-We consider this malformed since dimension {\tt k} exists on the RHS but not on
-the LHS. We do not adopt \emph{Einstein notation}, in which a repeated index
-implies a summation over that index.
-
-\section{Implementation Issues}
-
-In this section I discuss the various issues encountered in the initial creation
-of the code generator.
-
-\section{The Naïve Implementation}
-
-The most naïve implementation is to attempt to evaluate the RHS of the
-assignment using the same strategy one would use for a simple scalar expression
-tree.
-
-\end{document}
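The validation rule in docs/implementation_issues.tex (an index that appears on the RHS must also appear on the LHS) can be sketched as a set difference. This is a minimal Python illustration with hypothetical function names. It only considers indices that are exposed in the DSL, so spatial indices consumed inside an operator such as `inner` are outside its scope.

```python
def unbound_rhs_indices(lhs_indices, rhs_indices):
    """Return the RHS indices that do not appear on the LHS, sorted by name."""
    return sorted(set(rhs_indices) - set(lhs_indices))


def is_well_formed(lhs_indices, rhs_indices):
    """An assignment is accepted only if no RHS index is left unbound."""
    return not unbound_rhs_indices(lhs_indices, rhs_indices)


# a[i,j] = c[i][k] * d[j][k]: k is on the RHS only, so this is rejected.
malformed = unbound_rhs_indices(["i", "j"], ["i", "k", "j", "k"])

# kinet[alpha, beta] = inner(bra[alpha], ...ket[beta]...): accepted.
kinet_ok = is_well_formed(["alpha", "beta"], ["alpha", "beta"])
```

Because a repeated index does not imply summation here, the check reports `k` as an error where Einstein notation would silently sum over it.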
diff --git a/docs/operator_semantics.txt b/docs/operator_semantics.txt
deleted file mode 100644
index b02f74d..0000000
--- a/docs/operator_semantics.txt
+++ /dev/null
@@ -1,30 +0,0 @@
-- fftbox
-
-Conceptually, this is a block-wise operator. However, it is just a
-spatial restriction that can be applied on a per-point basis, but
-requires point co-ordinate information to do so.
-
-- *
-
-The scaling operator is point-wise and does not require access to any
-index values.
-
-- reciprocal
-
-This operator is block-wise. Since we intend to implement it as a black
-box, we need to incorporate reading into and out of a buffer passed to
-an FFT implementation.
-
-This operator will define new output indices.
-
-- laplacian
-
-Like the fftbox operator, this is conceptually block-wise but can be
-applied on a per-point basis.
-
-- inner
-
-Operates on blocks, can be applied point-wise. Is the only operator we
-currently have that binds spatial indices together. Typically will
-require one of the operands to support random access. Which one should
-be a choice for the compiler.