About me

About me

Feeds

RSS feed

Spelling checker using a b-tree

12th October 2015

For several of my projects I wanted a simple spelling checker, that would check each word against a wordlist, and simply return t if the word was in the wordlist, or nil otherwise.

For efficiency I decided to represent the dictionary as a b-tree; here's a description of the Lisp program to implement the whole spellchecker application.

Adding a word to the dictionary

The dictionary is stored in *dict*:

(defparameter *dict* nil)

Here's the routine to add a word to the dictionary:

(defun %add-word (word tree)
  (let* ((char1 (char-downcase (elt word 0)))
         (len (length word)))
    (flet ((findchar (x) (and (listp x) (eq (car x) char1))))
      (if (= len 1)
        (if (find char1 tree) tree `(,char1 ,@tree))
        ;;
        ;; length word > 1
        (let ((it (find-if #'findchar tree)))
          (if it
            `((,char1 ,@(%add-word (subseq word 1) (cdr it)))
              ,@(remove-if #'findchar tree)) 
            `((,char1 ,@(%add-word (subseq word 1) nil)) ,@tree)))))))

This is called by add-word:

(defun add-word (word)
  (setq *dict* (%add-word word *dict*)))

For example, after executing:

 (map nil #'add-word '("one" "two" "three" "four" "five" "six" "seven" "eight"))

we have:

CL-USER >*dict*
((#\e (#\i (#\g (#\h #\t)))) (#\s (#\e (#\v (#\e #\n))) (#\i #\x))
 (#\f (#\i (#\v #\e)) (#\o (#\u #\r))) (#\t (#\h (#\r (#\e #\e)))
 (#\w #\o)) (#\o (#\n #\e))

Looking up a word

Here's the routine find-word to look up a word in the dictionary. It uses this helper function:

(defun %find-word (word tree)
  (if (zerop (length word)) t
      (let* ((char1 (char-downcase (elt word 0))))
        (if
          (= (length word) 1)
          (find char1 tree)
          ;;
          ;; length word > 1
          (let ((it
                 (dolist (x tree)
                   (when (and (listp x) (eq (car x) char1)) (return x)))))
            (when it (%find-word (subseq word 1) (cdr it))))))))

Here's the routine itself:

(defun find-word (word)
  (%find-word word *dict*))
For example:
CL-USER > (find-word "ten")
NIL

CL-USER > (find-word "three")
#\e

Deleting a word from the dictionary

This is the routine to delete a word from the dictionary:

(defun %delete-word (word tree)
  (let* ((char1 (char-downcase (elt word 0))))
    (flet ((findchar (x) (and (listp x) (eq (car x) char1))))
      (if
          (= (length word) 1)
          (if (find char1 tree) (remove char1 tree) tree)
        ;;
        ;; length word > 1
        (let ((it (find-if #'findchar tree)))
          (if it
              `((,char1 ,@(%delete-word (subseq word 1) (cdr it)))
                ,@(remove-if #'findchar tree)) 
            `((,char1 ,@(%delete-word (subseq word 1) nil)) ,@tree)))))))

This is called with:

(defun delete-word (word)
  (cond
   ((%find-word word *dict*) (setq *dict* (%delete-word word *dict*)) t)
    (t nil)))

Applying a function to all words in the dictionary

Finally a routine to apply a function to each of the words in the dictionary:

(defun %map-words (tree path function)
  (cond
   ((atom tree)
    (funcall function (map 'string #'identity (reverse (cons tree path)))))
   (t (dolist (x (cdr tree))
        (%map-words x (cons (car tree) path) function)))))

This is called with:

(defun map-words (function dictionary)
  (dolist (x dictionary) (%map-words x nil function)))

For example:

CL-USER > (map-words #'print *dict*)

"eight" 
"seven" 
"six" 
"five" 
"four" 
"three" 
"two" 
"one" 
NIL

blog comments powered by Disqus