 
 
 
 
| sparey diesel | 
| "sparey diesel" | 
Ticking the Checkbox for all terms will enable AND logic.
  This means that a page must contain all of the terms you specify in order 
  to qualify. Note that even with AND logic, unless you use phrases as shown
  above, you are only guaranteed that returned pages contain all the terms
  somewhere, without any regard to order or proximity.  So using the first
  example shown above with the "all" checkbox ticked will return the pages
  that contain both words sparey and diesel on them
  someplace, at least once each, with no regard to where or when they appear.
  Naturally, AND searches will take longer.
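   As a rough illustration of the difference between the default "any term"
  search, the "all terms" (AND) search, and a quoted phrase, here is a minimal
  Python sketch. It is not the site's actual script (which is written in
  Perl); the function name page_matches, the require_all flag and the sample
  pages are all made up for the example.

```python
def page_matches(text, terms, require_all=False):
    """Return True if the page text satisfies the query.

    Matching is case-insensitive and substring-based, with no regard to
    the order or proximity of the terms. A phrase is just a term that
    happens to contain a space.
    """
    haystack = text.lower()
    hits = [term.lower() in haystack for term in terms]
    # AND logic needs every term present; the default needs at least one.
    return all(hits) if require_all else any(hits)


# Hypothetical pages, invented for this example.
pages = {
    "a.html": "The Sparey diesel was a popular design.",
    "b.html": "Notes on diesel fuel mixtures.",
    "c.html": "A diesel by another maker; Sparey is mentioned much later.",
}


def search(terms, require_all=False):
    return sorted(name for name, text in pages.items()
                  if page_matches(text, terms, require_all))


any_term = search(["sparey", "diesel"])                      # all three pages
all_terms = search(["sparey", "diesel"], require_all=True)   # a.html and c.html
phrase = search(["sparey diesel"])                           # a.html only
```

  Note that c.html qualifies for the AND search even though the two words are
  nowhere near each other; only the phrase search insists on adjacency.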
   In order to give back results as fast as possible, the pages found are 
  simply sorted alphabetically.  However, you have the option to "rank" the
  results by checking the second Checkbox.  This causes the listed result
  pages to be ordered using a technique known as
  "term weighted Inverse Document Frequency" (IDF). 
   Briefly, this technique assumes that terms that do not appear very
  often on any page will be more interesting than those that appear
  frequently on lots of pages, hence these terms are "weighted" to rank
  their page(s) higher.  So a page with a few interesting words may be
  ranked higher than one with lots of common words.
   Each record is allocated a "score", being the sum of the number
  of times each term occurs on the page, times the IDF value for that term.
  The number displayed for each result is a percentage value
  derived by dividing the page's score by the maximum score
  recorded for all of the pages found. Hence the one at the top will be 100%
  and so on down.  I think you can guess that this takes longer--and a ranked,
  AND logic query will take longest of all.
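   The scoring just described can be sketched in a few lines of Python. This
  is an illustration only, not the real script: the function name rank and
  the sample pages are invented, and the IDF formula used here, log(N / df)
  with N the number of pages searched and df the number of pages containing
  the term, is one common definition that I am assuming for the example.

```python
import math


def rank(pages, terms):
    """Return [(page, percent)] sorted best-first.

    pages: dict mapping page name -> page text.
    terms: list of search terms (matched case-insensitively as substrings).
    """
    n_pages = len(pages)
    lowered = {name: text.lower() for name, text in pages.items()}

    # Document frequency: on how many pages does each term occur at all?
    idf = {}
    for term in terms:
        t = term.lower()
        df = sum(1 for text in lowered.values() if t in text)
        # ASSUMPTION: idf = log(N / df); the real script may use another form.
        idf[t] = math.log(n_pages / df) if df else 0.0

    # Score = sum over terms of (occurrences on the page * IDF of the term).
    scores = {}
    for name, text in lowered.items():
        score = sum(text.count(t) * weight for t, weight in idf.items())
        if score > 0:
            scores[name] = score

    # Percentage = page score / best score, so the top page shows 100.
    top = max(scores.values(), default=0.0)
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(name, round(100 * score / top)) for name, score in ordered]


# Hypothetical pages, invented for this example.
pages = {
    "a.html": "sparey diesel diesel",
    "b.html": "diesel diesel diesel",
    "c.html": "glow plug engines",
    "d.html": "more glow engines",
}
results = rank(pages, ["sparey", "diesel"])
# results == [("a.html", 100), ("b.html", 75)]
```

  Both a.html and b.html contain three hits, but "sparey" occurs on fewer
  pages than "diesel", so it carries more weight and a.html comes out on top.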
   Searches are not case sensitive, so the terms Sparey and
  sparey (or SPAREY) will all return the same result. 
  Apart from rampaging featuritis, I can see no real benefit in providing
  an option for a case sensitive search. Please let me know if you can
  provide a scenario where this would help.
   By default, term and phrase searches are made without consideration
  for word boundaries, so a search for carb would match things like
  carb, carburettor, carbon and bicarbonate!!
  Checking the "Force whole word match" box will restrict matches to whole
  word boundaries (surprise).  The cost, time-wise, is not a lot more.  I've
  made the default "match partials" to increase the chance of hits.  Turn it
  off if you're being deluged with false positives.
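   For the curious, the difference between "match partials" and "force whole
  word match" comes down to whether the pattern is wrapped in word-boundary
  anchors. The Python sketch below shows the idea, along with the case
  insensitivity mentioned earlier. It is an illustration under my own
  assumptions, not the site's Perl code; build_pattern and its whole_word
  parameter are hypothetical names.

```python
import re


def build_pattern(term, whole_word=False):
    """Compile a case-insensitive pattern for one search term."""
    # Treat the term literally so punctuation in it has no special meaning.
    body = re.escape(term)
    if whole_word:
        # \b matches at the edge between a word character and a non-word one.
        body = r"\b" + body + r"\b"
    return re.compile(body, re.IGNORECASE)


words = ["carb", "Carburettor", "carbon", "bicarbonate", "CARB"]

partial = [w for w in words if build_pattern("carb").search(w)]
whole = [w for w in words if build_pattern("carb", whole_word=True).search(w)]
# partial keeps all five words; whole keeps only "carb" and "CARB".
```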
   For the techos, I must confess that the term parsing is rather simplistic
  and I imagine it will be possible to break it in several ways.  Under the
  hood, the work is being done by a CGI Perl script using REs. If you don't
  know a RE from a hole in the road, consider yourself fortunate, and possibly
  still relatively sane.  If you do, you can enter one as a term and really
  cause chaos with my poor script! (although several of the RE metacharacters
  are escaped--someone searching for an "ETA .29" really wants ".29", not any
  character followed by "29", right?)
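   The "ETA .29" problem is easy to demonstrate. The Python sketch below is
  only an illustration: it uses re.escape, which escapes every metacharacter,
  whereas the real script escapes only some of them, and the sample text is
  made up.

```python
import re

# Hypothetical page text, invented for this example.
text = "ETA .29 and ETA 129 and ETA x29"

# Unescaped, the dot means "any character", so three things match.
unescaped = re.findall(".29", text)

# Escaped, the dot is a literal full stop, so only the real ".29" matches.
escaped = re.findall(re.escape(".29"), text)

# unescaped == [".29", "129", "x29"]
# escaped == [".29"]
```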
   Currently, this site is hosted on an old Solaris box with an old Perl
  implementation. Tests have shown a modern Linux server runs the search an
  order of magnitude faster--I suspect that file globbing and lack of support
  for pre-compiled REs are the culprits.  Sad, but that's life.
   Finally, if you manage to break the script in some way, indicated by
  an Apache error page being returned that mumbles about
  "internal server error"
  and suggests that you contact the webmaster,
  please contact me--not our poor,
  besieged Web Mistress (or you'll be sorry).
 
