The search package is responsible for taking a text string such as "God & loves & world" and turning it into a series of calls to Book and Passage to find the answer. I am a little concerned that this design is overly complex, but I'm sure that stewing on it will make things clearer.

The Current Design

These are the requirements of the search.Engine:

This is how the current design works. The user types a string like "aaron ~5 & moses & thesaurus talk". (This means: find Moses within 5 verses of Aaron, alongside some speech-type activity.) The Engine prepends the string "/" and tokenizes the result into a Vector of SearchWords, one SearchWord for each part of the search string. The Vector (using Java array syntax) looks like this: { "/", "aaron", "~", "5", "&", "moses", "&", "thesaurus", "talk" }.
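
A minimal sketch of the tokenizing step, assuming that /, ~ and & are the only single-character commands and that everything else is separated by whitespace. The class name TokenizerSketch and the method tokenize() are made up for illustration; this is not the real Engine code, and it produces plain Strings rather than SearchWords.

```java
import java.util.Vector;

// Hypothetical sketch: the class name and the set of command characters are
// assumptions made for illustration, not the real Engine code.
class TokenizerSketch {
    // Single-character commands that are split off even without spaces.
    private static final String COMMAND_CHARS = "/~&";

    static Vector<String> tokenize(String search) {
        Vector<String> tokens = new Vector<>();
        StringBuilder word = new StringBuilder();
        // Prepend the default command "/", as described in the text.
        for (char c : ("/" + search).toCharArray()) {
            boolean isCommand = COMMAND_CHARS.indexOf(c) >= 0;
            if (isCommand || Character.isWhitespace(c)) {
                if (word.length() > 0) {      // flush the pending word
                    tokens.add(word.toString());
                    word.setLength(0);
                }
                if (isCommand) {
                    tokens.add(String.valueOf(c));
                }
            } else {
                word.append(c);
            }
        }
        if (word.length() > 0) {              // flush the last word
            tokens.add(word.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [/, aaron, ~, 5, &, moses, &, thesaurus, talk]
        System.out.println(tokenize("aaron ~5 & moses & thesaurus talk"));
    }
}
```

Note that "~5" comes out as two tokens, which matches the 9-element Vector above.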

SearchWord is an interface, implemented in several ways. The Engine selects which SearchWord to use from a Hashtable of SearchWords. The members of this Hashtable are the available SearchWords, keyed on a token (in this example the matching tokens are /, ~, &, & and thesaurus). A DefaultParamWord is created for the words in the search string that do not have keys in the Hashtable (aaron, moses and talk).
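
The lookup could be sketched roughly as follows. Only the names SearchWord and DefaultParamWord come from the text; NamedWord, LookupSketch and lookup() are hypothetical, and the real SearchWord interface certainly has methods that are left out here.

```java
import java.util.Hashtable;
import java.util.Vector;

// Hypothetical sketch: NamedWord, LookupSketch and lookup() are invented
// for illustration; the real interfaces are richer than this.
interface SearchWord { }

// Stands in for any word registered with the Engine (/, ~, &, thesaurus).
class NamedWord implements SearchWord {
    final String token;
    NamedWord(String token) { this.token = token; }
}

// Created on the fly for tokens that have no key in the Hashtable.
class DefaultParamWord implements SearchWord {
    final String text;
    DefaultParamWord(String text) { this.text = text; }
}

class LookupSketch {
    // The available SearchWords, keyed on their token.
    static final Hashtable<String, SearchWord> WORDS = new Hashtable<>();
    static {
        for (String token : new String[] { "/", "~", "&", "thesaurus" }) {
            WORDS.put(token, new NamedWord(token));
        }
    }

    static Vector<SearchWord> lookup(String[] tokens) {
        Vector<SearchWord> words = new Vector<>();
        for (String token : tokens) {
            SearchWord known = WORDS.get(token);
            // Unknown tokens become DefaultParamWords.
            words.add(known != null ? known : new DefaultParamWord(token));
        }
        return words;
    }
}
```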

The Vector is better understood like this:

    / aaron
    ~ 5
    & moses
    & thesaurus talk

Each of these 9 elements in the Vector is a SearchWord. The first element on each line (/, ~, & and &) is a CommandWord; the others (aaron, 5, moses, thesaurus and talk) are ParamWords. CommandWord and ParamWord inherit from SearchWord.

So in other words you could write the Vector like this. Note that the new bullet points are for each CommandWord; the Vector is strictly 1D and does not care at all about the difference between CommandWords and ParamWords:

It is worth noting that all the DefaultParamWords are created from unknown tokens. The other SearchWords, both the CommandWords (/, ~ and &) and the ParamWord (thesaurus), were members of the Hashtable in the Engine.

The search Engine loops, taking an element from the Vector, expecting it to be a CommandWord, and calling CommandWord.updatePassage(). These CommandWords have the opportunity to take further elements from the Vector and treat them as ParamWords. If an element turns out to be the wrong kind of word, the result is a ClassCastException, which is caught and translated into a sensible error message.
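
A rough sketch of that loop. Only the names SearchWord, CommandWord, ParamWord, DefaultParamWord and updatePassage() come from the text; the method signatures, AddCommandWord, EngineSketch and the TreeSet that stands in for a Passage are all assumptions. The stand-in command just records its parameter instead of searching a Bible.

```java
import java.util.Iterator;
import java.util.TreeSet;
import java.util.Vector;

// Hypothetical sketch: signatures and the TreeSet "passage" are assumptions.
interface SearchWord { }

interface ParamWord extends SearchWord {
    String getText();
}

interface CommandWord extends SearchWord {
    // May pull further elements from the iterator and treat them as ParamWords.
    void updatePassage(Iterator<SearchWord> rest, TreeSet<String> passage);
}

class DefaultParamWord implements ParamWord {
    private final String text;
    DefaultParamWord(String text) { this.text = text; }
    public String getText() { return text; }
}

// Stand-in for the "/" command: records its parameter in the passage.
class AddCommandWord implements CommandWord {
    public void updatePassage(Iterator<SearchWord> rest, TreeSet<String> passage) {
        // This cast fails if the next element is another command.
        ParamWord param = (ParamWord) rest.next();
        passage.add(param.getText());
    }
}

class EngineSketch {
    static String search(Vector<SearchWord> words) {
        TreeSet<String> passage = new TreeSet<>();
        Iterator<SearchWord> it = words.iterator();
        try {
            while (it.hasNext()) {
                // Each element reached here is expected to be a CommandWord.
                CommandWord command = (CommandWord) it.next();
                command.updatePassage(it, passage);
            }
        } catch (ClassCastException ex) {
            return "Syntax error: a command or parameter is out of place";
        }
        return passage.toString();
    }
}
```

The sketch does not deal with a command at the very end of the Vector that is missing its parameter; that case would need its own check.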

Historical Designs

This does NOT represent the current design. I've left it here to show the steps I went through to get to the current design. There were 2 possible designs: the smart engine model and the smart data model. The latter won. The ideas were like this:

Smart Engine Model

The engine understands how to parse the search string into a series of calls to the relevant places. The engine is extensible by adding new 'commands' (which must follow a SearchWords interface, now deleted). This model has the advantages of simplicity and memory-efficiency.

Smart Data Model

The engine simply turns the search string into a data structure. The nodes of this structure are instantiated as classes that follow an interface with a getAnswer() method. Calling getAnswer() on the root node recurses down to find the answer. The big advantage of this model is that it can be readily extended to several types of interface, from the most basic GUI find dialog to a ridiculously powerful command line version.
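
A minimal sketch of the recursion idea. Only getAnswer() is named in the text; Node, WordNode, RetainNode and the integer verse numbers are invented for illustration, and the leaf nodes are hard-coded rather than backed by a real index.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: every name here except getAnswer() is an assumption.
interface Node {
    Set<Integer> getAnswer();
}

// Leaf node: holds the verses for one word (hard-coded, no real index).
class WordNode implements Node {
    private final Set<Integer> verses;
    WordNode(Integer... verses) { this.verses = new TreeSet<>(Arrays.asList(verses)); }
    public Set<Integer> getAnswer() { return new TreeSet<>(verses); }
}

// Inner node: recurses into both children and keeps the verses they share.
class RetainNode implements Node {
    private final Node left;
    private final Node right;
    RetainNode(Node left, Node right) { this.left = left; this.right = right; }
    public Set<Integer> getAnswer() {
        Set<Integer> answer = left.getAnswer();
        answer.retainAll(right.getAnswer());
        return answer;
    }
}
```

Calling getAnswer() on a RetainNode built from WordNode(1, 2, 3) and WordNode(2, 3, 4) walks down to both leaves and comes back with [2, 3].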

I toyed with an XML based engine. The Engine parses the search string into an XML Document. Something like this:

XML representation of the above search, and the code that implements it

<search>                        // ref = new Passage();
  <add>                         // ref.addAll(
    <word>aaron</word>          //   default_bible.getPassages("aaron")
  </add>                        // );
  <blur>5</blur>                // ref.blur(5);
  <retain>                      // ref.retainAll(
    <word>moses</word>          //   default_bible.getPassages("moses")
  </retain>                     // );
  <add>                         // ref.addAll(
    <words>                     //   default_bible.getPassages(
      <thesarus>talk</thesarus> //     thesarus.getSynonyms("talk")
    </words>                    //   )
  </add>                        // );
</search>                       // return ref;

The benefit of this is that it allows us to easily remote the whole search engine. I seem to have an XML disease, so why shouldn't it show up here too!

However, I decided that a remote search engine was of little benefit, since the individual SearchWords can be remoted via a very simple stub, giving an engine that can be remoted piecemeal. The only drawback to this solution is on high latency networks (erm, like the Internet), where a set of simple requests can take a lot longer than a single complex one. However, I am sure that I could XMLize or serialize the Vector described above.

SoundEx

Some code to do soundex matching ...

// create object listing the SOUNDEX values for each letter
// -1 indicates that the letter is not coded, but is used for coding
//  1 is for BFPV
//  2 is for CGJKQSXZ
//  3 is for DT
//  4 is for L
//  5 is for MN my home state
//  6 is for R
function makesoundex()
{
    this.a = -1
    this.b =  1
    this.c =  2
    this.d =  3
    this.e = -1
    this.f =  1
    this.g =  2
    this.h = -1
    this.i = -1
    this.j =  2
    this.k =  2
    this.l =  4
    this.m =  5
    this.n =  5
    this.o = -1
    this.p =  1
    this.q =  2
    this.r =  6
    this.s =  2
    this.t =  3
    this.u = -1
    this.v =  1
    this.w = -1
    this.x =  2
    this.y = -1
    this.z =  2
}

var sndx=new makesoundex()

// check to see that the input is valid
function isSurname(name)
{
    if (name=="" || name==null)
    {
        alert("Please enter surname for which to generate SOUNDEX code.")
        return false
    }
    else
    {
        for (var i=0; i<name.length; i++)
        {
            var letter=name.charAt(i)
            if (!(letter>='a' && letter<='z' || letter>='A' && letter<='Z'))
            {
                alert("Please enter only letters in the surname.")
                return false
            }
        }
    }

    return true
}

// Collapse out directly adjacent sounds
// 1. Assume that surname.length>=1
// 2. Assume that surname contains only lowercase letters
function collapse(surname)
{
    if (surname.length==1)
    {
        return surname
    }

    var lname=(document.myform.surname.value)
    document.myform.lname.value=lname
    var right=collapse(surname.substring(1,surname.length))

    if (sndx[surname.charAt(0)]==sndx[right.charAt(0)])
    {
        return surname.charAt(0)+right.substring(1,right.length)
    }

    return surname.charAt(0)+right  
}

// Compute the SOUNDEX code for the surname
function soundex(form)
{
    form.result.value=""
    if (!isSurname(form.surname.value))
    {
        return
    }
      
    var stage1=collapse(form.surname.value.toLowerCase())
    form.result.value+=stage1.charAt(0).toUpperCase() // Retain first letter
    form.result.value+="-" // Separate letter with a dash
    var stage2=stage1.substring(1,stage1.length)
    var count=0

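    // Append the codes of up to three coded letters, skipping uncoded ones
    for (var i=0; i<stage2.length && count<3; i++)
    {
        if (sndx[stage2.charAt(i)]>0)
        {
            form.result.value+=sndx[stage2.charAt(i)]
            count++
        }
    }

    // Pad with zeros if fewer than three digits were produced
    for (; count<3; count++)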
    {
        form.result.value+="0"
    }

    form.surname.select()
    form.surname.focus()
}