VM details | ![]() |
![]() |
![]() |
![]() |
For those who prefer their descriptions to be in the form of source code, the compilable example here should help. It uses exactly the same code as YAC to implement the virtual machine, ( declaration and definition), but the symbol tables and computational stack are hard-coded rather than generated dynamically by the parser.
This example code illustrates how the virtual machine evaluates numerical expressions similar to the one described earlier. However, it then goes on to show:
That is, it trials the entire functionality of the virtual machine. Here, I compile it:
g++ -DDEBUG -I$HOME/boost/cvs -o example_vm example_vm.cpp
(The -DDEBUG flag causes the machine to generate diagnostic output as the stack is evaluated.)
What follows below is a description of some of the requirements that the design addresses.
The computational stack is nothing more than a collection of polymorphic nodes:
namespace yac {
struct node {
virtual ~node() {}
};
struct stack {
std::list<boost::shared_ptr<node> > data;
};
} // namespace yac
Whilst it is convenient to use shared_ptr to manage the data storage, this is an implementation detail. The user is interested in a collection of nodes, so all access to the data is controlled by the begin() and end() member functions that return boost::indirect_iterators to the data: Such an iterator, when dereferenced, returns a reference to a node rather than to a shared_ptr<node>.
class stack {
typedef std::list<boost::shared_ptr<node> > container_t;
container_t data;
public:
typedef boost::indirect_iterator<container_t::iterator> iterator;
iterator begin();
iterator end();
};
In fact, the class goes on to wrap only those member functions of std::list that are actually used.
The copy semantics of list<shared_ptr<node> > are completely different to those of the hypothetical list<node>. This is unfortunate, because evaluation of the virtual machine requires that a copy be made of the stack and that this copy be transformed by the unwinding process. Moreover, when user-defined functions are used, evaluation requires that the data in the copied stack be altered to ensure that it is consistent with the symbol table of local variables. (The ability to define recursive functions means that each call to evaluate a user-defined function must use its own copy of any local variables data.)
In order to provide the required copy semantics, yac::stack defines a copy constructor and assignment operators that in turn invoke the helper function copy:
struct node {
virtual boost::shared_ptr<node> clone() const = 0;
};
void copy(stack::container_t & lhs, stack::container_t const & rhs)
{
lhs.clear();
stack::container_t::const_iterator it = rhs.begin();
stack::container_t::const_iterator const end = rhs.end();
for(; it != end; ++it)
lhs.push_back((*it)->clone());
}
That is, copy generates a true copy of the raw data. Doing so places a requirement on all classes deriving from node that they should implement a clone member function that generates a copy of the data rather than increment the shared_ptr reference count.
Generally speaking, the computational stack is a simple list of data and operations. However, YAC supports conditional evaluation of both a||b and a?b:c where a, b and c can all be complicated expressions themselves. Such conditional expressions are not well represented by a simple list, so YAC implements an or_node and a branch_node that contain stack variables themselves. In the case of or_node, the stored stack represents the expression b in a||b. It is evaluated only if a evaluates to false. The branch_node stores two stacks, one for b and one for c in the expression a?b:c. Which stack is evaluated depends on the result of evaluating a.
Usually, these are mere implementation details of the two node types and the computational stack should be regarded as a simple list. An exception to this rule does exist, however. If the user defines a recursive function such as:
factorial(x) = (x < 0.01) ? 1 : x * factorial(x - 1)
then evaluation of, say, factorial(2) will require evaluations of factorial(1) and factorial(0) also. All three function calls should have separate symbol tables for the local variable x and so the stack representation of (x<0.01)?1:x*factorial(x-1) that is copied for each evaluation should be modified so that references to x point to the correct data. Perhaps some code will make this clearer:
// params contains the argument list to be passed to the function
double user_function::eval_priv(vector<double> const & params) const
{
BOOST_ASSERT(arg_names_.size() == arity_);
// Create a copy of the local variables symbol table.
symbol_table_t local_vars_copy = local_vars_;
// Assign values to the elements of this symbol table
// using the argument list data stored in params.
std::size_t const size = arg_names_.size();
for (std::size_t i = 0; i != size; ++i) {
double * const ptr = find(local_vars_copy, arg_names_[i].c_str());
BOOST_ASSERT(ptr);
*ptr = params[i];
}
// Ensure that the variable_nodes in stk_copy point to
// the appropriate entry in local_vars_copy (rather than
// those of local_vars_).
stack stk_copy = stk_;
update_stack(stk_copy, local_vars_copy, local_vars_, arg_names_);
// Evaluate the stack.
return evaluate(stk_copy);
}
It is the update_stack function only that requires a mechanism to iterate over the stored stacks of any or_nodes and branch_nodes in stk_copy. The implementation is trivially simple, using nbranches and branch virtual functions to access these stacks:
struct node {
virtual std::size_t nbranches() const { return 0; }
virtual stack * branch(std::size_t) { return 0; }
};
struct function_node : public node {...};
struct or_node : public function_node {
or_node(stack const & stk);
std::size_t nbranches() const { return 1; }
stack * branch(std::size_t i) { return i == 0 ? &rhs_stk_ : 0; }
private:
stack rhs_stk_;
};
struct branch_node : public function_node {
branch_node(stack const & stk1, stack const & stk2);
std::size_t nbranches() const { return 2; }
stack * branch(std::size_t i)
{ return (i > 1) ? 0 : (i == 0 ? &stk1_ : &stk2_); }
private:
stack stk1_, stk2_;
};
There are five primary node types visible to the evaluate function: number_node and function_node together represent the numeric expressions defined by the expression grammar. Evaluation of such an expressions proceeds in a manner identical to that described earlier. print_node simply outputs the evaluated expression to stdout. variable_node will assign the result of such an expression to an element of one of the variable symbol tables. func_def_node will assign a user_func_expression variable to an element in the functions symbol table. Almost all other node classes are specializations of function_node.
![]() |
![]() |
![]() |
Copyright © 2004 Angus Leeming
Distributed under the Boost Software License,
Version 1.0. (See accompanying file
LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt
)