%---TeX-start-of-header--- \documentclass[titlepage]{article} \usepackage{mathematica,figi, alltt} \mmanobreak \newenvironment{implementation}% {\vspace{-0.4cm}\footnotesize\textbf{Implementation notes: }}{\bigskip} \begin{comment} \begin{mathematica} <Infinity]; zero = 0.00001; ohide = DisplayFunction->Identity; oshow = DisplayFunction -> $DisplayFunction; ojoin = PlotJoined -> True; oall = PlotRange->All; osurf = HiddenSurface->False; deg = N[Degree]; pi = N[Pi]; \end{mathematica} \end{comment} \begin{document} \psfiginit %---TeX-end-of-header--- \author{Tomasz J.\ Cholewo} \title{Design and Analysis of Computer Algorithms, EMCS\,619\\ Computer Project \#2\\[0.5in] Optimal Binary Search Trees} \maketitle %==================================================================== \section{Listings of \Mma{} Functions} %==================================================================== \subsection{Function \texttt{minpos}} Supplementary function \texttt{minpos} returns a list consisting of an index of the first smallest element of a list~\texttt{l} and of this smallest element. \begin{mathematica} minpos[l_List] := Module[{i, mpos = 1, minval = l[[1]]}, For[i = 2, i <= Length[l], i++, If[ l[[i]] < minval, mpos = i; minval = l[[i]] ] ]; {mpos, minval} (* return value *) ]; \end{mathematica} %==================================================================== \subsection{Function \texttt{OptimalSearchTree} -- Dynamical Programming Version} Function \texttt{OptimalSearchTree} returns a list consisting of matrices \texttt{A} representing cost and \texttt{root} defining an optimal binary search tree for a sorted list of elements described by a given probability list \texttt{p}. \begin{mathematica} OptimalSearchTree[p_List] := Module[ {A, root, n = Length[p], i, j, diag, minval}, (* local variables *) A = Table["-", {i, 1, n + 1}, {j, 0, n}]; (* fancy initialization of arrays *) root = Table["-", {i, 1, n}, {j, 1, n}]; For[i = 1, i <= n, i++, A[[i, i + 1]] = p[[i]]; (* upper diagonal *) A[[i, i]] = 0; (* diagonal *) root[[i, i]] = i; (* diagonal of root *) ]; A[[n + 1, n + 1]] = 0; (* last element of diagonal *) For[diag = 1, diag < n, diag++, (* diag = diagonal no. *) For[i = 1, i <= n - diag, i++, (* i = row *) j = i + diag; (* j = column *) {root[[i, j]], minval} = (* index and value of the smallest sum *) minpos[ Table[ A[[i, k]] + A[[k + 1, j + 1]], {k, i, j}] ]; root[[i, j]] += i - 1; (* index must be increased by the position of list *) A[[i, j + 1]] = minval + Sum[ p[[k]], {k, i, j}]; (* fill the entry *) ] ]; { A, root } ] \end{mathematica} \begin{implementation} Indices of lists in \Mma{} start at 1, what forced some modifications to the original textbook algorithm. Index returned from \texttt{minpos} must be increased because it does not reflect the relative position of the list being its argument. \end{implementation} %==================================================================== \newpage \subsection{Function \texttt{RecurrentOptimalSearchTree} -- Recurrent Version} The possibility of creating functions which ``remember'' once calculated values makes the concept of the dynamical programming less appealing. The recurrent version can be nearly as effective as the dynamical one. Function \texttt{RecurrentOptimalSearchTree} returns the same results as the function \texttt{OptimalSearchTree}. \begin{mathematica} RecurrentOptimalSearchTree[p_List] := Module[ {A, root, n = Length[p]}, (* local functions and variable *) A[i_, i_] := p[[i]]; (* upper diagonal *) A[i_, j_] := 0 /; (j == i - 1); (* diagonal *) A[i_, j_] := (A[i, j] = Module[{mpos, minval}, {mpos, minval} = minpos[ Table[ A[i, k - 1] + A[k + 1, j], {k, i, j}] ]; root[i, j] = mpos + i - 1; (* store the value of current root *) minval + Sum[p[[k]], {k, i, j}] (* result value *) ]) /; (j >= i); (* def only for upper diagonals *) A[i_, j_] := "-"; (* the rest of entries is marked undefined *) root[i_, i_] := i; (* diagonal of root *) root[i_, j_] := "-"; (* initial settings *) A[1, n]; (* start recursion but discard the average search time *) { Table[A[i, j], {i, 1, n + 1}, {j, 0, n}], Table[root[i,j], {i, n}, {j, n}] } (* artificially collected results *) ] \end{mathematica} \begin{implementation} Idiom \verb+f[x_] := f[x] = ...+ constructs a function which can remember the values it already calculated, because the second assignment is not delayed (=) and \verb+x+ in \verb+f[x]+ is not a pattern but a value for which the whole function was called. \end{implementation} %==================================================================== \subsection{Output Functions} Function \texttt{TreePreorder} returns a list of vertices ordered according to the preorder traversal method in the tree described by the \texttt{root} array. \begin{mathematica} TreePreorder[root_List] := Module[{ST}, ST[i_Integer, j_Integer] := {} /; (i > j); ST[i_Integer, j_Integer] := Module[{k = root[[i, j]]}, Join[ ST[i, k - 1], ST[k + 1, j], {k} ] (* preorder visits children first *) ]; ST[1, Length[root]] ]; \end{mathematica} %==================================================================== Function \texttt{TreeList} returns a list representation of a tree described by the \texttt{root} table (\verb+{}+ denotes an empty tree). \begin{mathematica} TreeList[root_List] := Module[{ST}, ST[i_Integer, j_Integer] := {} /; (i > j); ST[i_Integer, j_Integer] := Module[{k = root[[i, j]]}, List[ ST[i, k - 1], k, ST[k + 1, j] ] ]; ST[1, Length[root]] ]; \end{mathematica} \begin{implementation} Function \texttt{ST[i,j]} finds a sublist for vertices \texttt{i} to \texttt{j}. \end{implementation} \newpage %==================================================================== Function \texttt{ShowTree} graphically presents a tree encoded in the root table. \begin{mathematica} ShowTree[root_List] := Module[{ST}, ST[i_Integer, j_Integer] := {} /; (j <= i); ST[i_Integer, j_Integer] := Module[{k = root[[i, j]]}, Join[If[ k > i, {{k, root[[i, k - 1]]}}, {}], ST[i, k - 1], If[ k < j, {{k, root[[k + 1, j]]}}, {}], ST[k + 1, j] ] ]; ShowLabeledGraph[ RootedEmbedding[ FromUnorderedPairs[ ST[1, Length[root]] ], root[[1, -1]] ]] ]; \end{mathematica} \begin{implementation} Function \texttt{ST[i,j]} finds a list of edges of the subtree consisting of vertices from \texttt{i} to \texttt{j}. List for all vertices is displayed using rooted embedding based on the root of the whole tree. \end{implementation} %==================================================================== \newpage \section{Tests} \subsection{Test of the \texttt{minpos} Function} \begin{mathematica} minpos[{20, 10, 6, 40, 11, 2, 8}] . = {6, 2} \end{mathematica} %==================================================================== \subsection{First Data Set} Given probabilities: \begin{mathematica} p1 = { 0.2, 0.24, 0.16, 0.28, 0.12 }; \end{mathematica} Calculate the optimal tree: \begin{mathematica} {A1, root1} = OptimalSearchTree[p1]; \end{mathematica} The average search time (it is the last element in the first row): \begin{mathematica} A1[[1, -1]] . = 2.04 \end{mathematica} Cost array: \begin{mathematica} A1 // TableForm[#, TableSpacing->{0, 2}]& . //TableForm= 0 0.2 0.64 0.96 1.68 2.04 - 0 0.24 0.56 1.2 1.48 - - 0 0.16 0.6 0.84 - - - 0 0.28 0.52 - - - - 0 0.12 - - - - - 0 \end{mathematica} Root positions array: \begin{mathematica} root1 // TableForm[#, TableSpacing->{0, 2}]& . //TableForm= 1 2 2 2 2 - 2 2 3 4 - - 3 4 4 - - - 4 4 - - - - 5 \end{mathematica} Vertices in the preorder traversal order: \begin{mathematica} TreePreorder[root1] . = {1, 3, 5, 4, 2} \end{mathematica} The resulting tree in list form: \begin{mathematica} TreeList[root1] . = {{{}, 1, {}}, 2, {{{}, 3, {}}, 4, {{}, 5, {}}}} \end{mathematica} It can also be presented using \Mma{} \texttt{TreeForm} function showing levels of nesting. \begin{mathematica} TreeForm[ TreeList[root1] ] /. List -> L . //TreeForm= L[| , 2, | ] L[| , 1, | ] L[| , 4, | ] L[] L[] L[| , 3, | ] L[| , 5, | ] L[] L[] L[] L[] \end{mathematica} Graphical rendering of the tree is shown in Figure~\ref{obst1}. \begin{mathematica} Display["obst1.eps", Show[ ShowTree[root1], AspectRatio->0.7 ], "EPS"]; \end{mathematica} \figi{obst1}{Optimal binary search tree for the first set of probabilities.} %================================================================== \subsection{Second Data Set} Probabilities: \begin{mathematica} p2 = { 0.06, 0.11, 0.08, 0.20, 0.05, 0.12, 0.17, 0.03, 0.10, 0.08 }; \end{mathematica} We can check if they really sum up to 1. \begin{mathematica} Plus @@ p2 . = 1. \end{mathematica} Calculate the optimum tree using the recurrent version this time: \begin{mathematica} {A2, root2} = RecurrentOptimalSearchTree[p2]; \end{mathematica} The average search time: \begin{mathematica} A2[[1, -1]] . = 2.48 \end{mathematica} Cost array: \begin{mathematica} A2 // TableForm[#, TableSpacing->{0, 2}]& . //TableForm= 0 0.06 0.23 0.39 0.84 0.94 1.23 1.74 1.83 2.16 2.48 - 0 0.11 0.27 0.66 0.76 1.05 1.56 1.65 1.98 2.3 - - 0 0.08 0.36 0.46 0.75 1.25 1.34 1.66 1.9 - - - 0 0.2 0.3 0.59 1.01 1.1 1.42 1.66 - - - - 0 0.05 0.22 0.56 0.62 0.85 1.09 - - - - - 0 0.12 0.41 0.47 0.7 0.94 - - - - - - 0 0.17 0.23 0.46 0.69 - - - - - - - 0 0.03 0.16 0.32 - - - - - - - - 0 0.1 0.26 - - - - - - - - - 0 0.08 - - - - - - - - - - 0 \end{mathematica} Root positions array: \begin{mathematica} root2 // TableForm[#, TableSpacing->{0, 2}]& . //TableForm= 1 2 2 4 4 4 4 4 4 4 - 2 2 4 4 4 4 4 4 4 - - 3 4 4 4 6 6 7 7 - - - 4 4 4 6 6 7 7 - - - - 5 6 6 7 7 7 - - - - - 6 7 7 7 7 - - - - - - 7 7 7 9 - - - - - - - 8 9 9 - - - - - - - - 9 9 - - - - - - - - - 10 \end{mathematica} Vertices in the preorder traversal order: \begin{mathematica} TreePreorder[root2] . = {1, 3, 2, 5, 6, 8, 10, 9, 7, 4} \end{mathematica} The resulting tree in list form: \begin{mathematica} TreeList[root2] . = {{{{}, 1, {}}, 2, {{}, 3, {}}}, 4, {{{{}, 5, {}}, 6, {}}, 7, {{{}, 8, {}}, 9, {{}, 10, {}}}}} \end{mathematica} Graphical rendering of the tree is shown in Figure~\ref{obst2}. \begin{mathematica} Display["obst2.eps", Show[ShowTree[root2], AspectRatio->0.7], "EPS"]; \end{mathematica} \figi{obst2}{Optimal binary search tree for the second set of probabilities.} %==================================================================== %==================================================================== \newpage \section{Discussion} \subsection{Pidgin--Pascal Formulation of the \texttt{TreeList} Function} \begin{alltt} \textbf{function} TreeList(i, j: integer): list; \textbf{begin} \textbf{if} i > j \textbf{then} \textbf{return} \{\} \textbf{else begin} k := root[i, j]; \textbf{return} \textit{a list consisting of} TreeList(i, k - 1)\textit{,} k\textit{, and} TreeList(k + 1, j); \textbf{end}; \textbf{end}; \end{alltt} \subsection{Efficiency} The computational complexities of both functions \texttt{OptimalSearchTree} and \texttt{RecurrentOptimalSearchTree} are equal to $\Theta(n^3)$. One can ``prove'' that they have the same order of complexity by comparing the following execution times for the two data sets considered: \begin{mathematica} Print[{Timing[OptimalSearchTree[p1]][[1]], Timing[RecurrentOptimalSearchTree[p1]][[1]]}]; Print[{Timing[OptimalSearchTree[p2]][[1]], Timing[RecurrentOptimalSearchTree[p2]][[1]]}] . {0.02 Second, 0.05 Second} {0.1 Second, 0.18 Second} \end{mathematica} As one can see the recurrent version has 60\% greater overhead (caused mainly by pattern matching) but the ratios of time remain nearly identical. Function \texttt{TreeList} has complexity $\Theta(n)$ because in each recursive call it visits a single vertex. %==================================================================== \end{document}