Class TernaryTree

java.lang.Object
org.passay.dictionary.TernaryTree

public class TernaryTree extends Object
Implementation of a ternary tree. Methods are provided for inserting strings and searching for strings. The algorithms in this class are all recursive, and have not been optimized for any particular purpose. Data which is inserted is not sorted before insertion, however data can be inserted beginning with the median of the supplied data.
  • Field Details

    • CASE_SENSITIVE_COMPARATOR

      protected static final Comparator<String> CASE_SENSITIVE_COMPARATOR
      Case sensitive comparator.
    • CASE_INSENSITIVE_COMPARATOR

      protected static final Comparator<String> CASE_INSENSITIVE_COMPARATOR
      Case insensitive comparator.
    • LINE_SEPARATOR

      private static final String LINE_SEPARATOR
      File system line separator.
    • EMPTY_ARRAY

      private static final String[] EMPTY_ARRAY
      An empty results array.
    • comparator

      protected final Comparator<String> comparator
      Character comparator.
    • root

      private TernaryNode root
      root node of the ternary tree.
  • Constructor Details

    • TernaryTree

      public TernaryTree()
      Creates an empty case-sensitive ternary tree.
    • TernaryTree

      public TernaryTree(boolean caseSensitive)
      Creates an empty ternary tree with the given case sensitivity.
      Parameters:
      caseSensitive - whether this ternary tree should be case-sensitive.
  • Method Details

    • insert

      public void insert(CharSequence word)
      Inserts the supplied word into this tree.
      Parameters:
      word - to insert
    • insert

      public void insert(CharSequence[] words)
      Inserts the supplied array of words into this tree.
      Parameters:
      words - to insert
    • search

      public boolean search(CharSequence word)
      Returns whether the supplied word has been inserted into this ternary tree.
      Parameters:
      word - to search for
      Returns:
      whether the word was found
    • partialSearch

      public CharSequence[] partialSearch(CharSequence word)
      Returns an array of strings which partially match the supplied word. word should be of the format '.e.e.e' Where the '.' character represents any valid character. Possible results from this query include: Helene, delete, or severe Note that no substring matching occurs, results only include strings of the same length. If the supplied word does not contain the '.' character, then a regular search is performed.

      NOTE This method is not supported for case-insensitive ternary trees. Since the tree is built without regard to case any words returned from the tree may or may not match the case of the supplied word.

      Parameters:
      word - to search for
      Returns:
      array of matching words
      Throws:
      UnsupportedOperationException - if this is a case-insensitive ternary tree
    • nearSearch

      public CharSequence[] nearSearch(CharSequence word, int distance)
      Return an array of strings which are near to the supplied word by the supplied distance. For the query nearSearch("fisher", 2): Possible results include: cipher, either, fishery, kosher, sister. If the supplied distance is not > 0, then a regular search is performed.

      NOTE This method is not supported for case-insensitive ternary trees. Since the tree is built without regard to case any words returned from the tree may or may not match the case of the supplied word.

      Parameters:
      word - to search for
      distance - for valid match
      Returns:
      array of matching words
      Throws:
      UnsupportedOperationException - if this is a case-insensitive ternary tree
    • getWords

      public List<CharSequence> getWords()
      Returns a list of all the words in this ternary tree. This is a very expensive operation, every node in the tree is traversed. The returned list cannot be modified.
      Returns:
      unmodifiable list of words
    • print

      public void print(Writer out, boolean fullPath) throws IOException
      Prints an ASCII representation of this ternary tree to the supplied writer. This is a very expensive operation, every node in the tree is traversed. The output produced is hard to read, but it should give an indication of whether your tree is balanced.
      Parameters:
      out - to print to
      fullPath - specifies whether each line should show the full path from root or only the suffix
      Throws:
      IOException - if an error occurs
    • print

      public void print(Writer out) throws IOException
      Prints an ASCII representation of this ternary tree to the supplied writer. This is a very expensive operation, every node in the tree is traversed. The output produced is hard to read, but it should give an indication of whether your tree is balanced.
      Parameters:
      out - to print to
      Throws:
      IOException - if an error occurs
    • insertNode

      private TernaryNode insertNode(TernaryNode node, CharSequence word, int index)
      Recursively inserts a word into the ternary tree one node at a time beginning at the supplied node.
      Parameters:
      node - to put character in
      word - to be inserted
      index - of character in word
      Returns:
      ternary node to insert
    • searchNode

      private boolean searchNode(TernaryNode node, int[] word, int index)
      Recursively searches for a word in the ternary tree one node at a time beginning at the supplied node.
      Parameters:
      node - to search in
      word - to search for
      index - of character in word
      Returns:
      whether the word was found
    • partialSearchNode

      private List<CharSequence> partialSearchNode(TernaryNode node, List<CharSequence> matches, CharSequence match, CharSequence word, int index)
      Recursively searches for a partial word in the ternary tree one node at a time beginning at the supplied node.
      Parameters:
      node - to search in
      matches - of partial matches
      match - the current word being examined
      word - to search for
      index - of character in word
      Returns:
      list of matches
    • nearSearchNode

      private List<CharSequence> nearSearchNode(TernaryNode node, int distance, List<CharSequence> matches, CharSequence match, CharSequence word, int index)
      Recursively searches for a near match word in the ternary tree one node at a time beginning at the supplied node.
      Parameters:
      node - to search in
      distance - of a valid match, must be > 0
      matches - list of near matches
      match - the current word being examined
      word - to search for
      index - of character in word
      Returns:
      list of matches
    • traverseNode

      private List<CharSequence> traverseNode(TernaryNode node, String s, List<CharSequence> words)
      Recursively traverses every node in the ternary tree one node at a time beginning at the supplied node. The result is a string representing every word, which is delimited by the LINE_SEPARATOR character.
      Parameters:
      node - to begin traversing
      s - string of words found at the supplied node
      words - which will be returned (recursive function)
      Returns:
      list of strings containing all words from the supplied node
    • printNode

      private void printNode(TernaryNode node, String s, int depth, boolean fullPath, StringBuilder buffer)
      Recursively traverses every node in the ternary tree rooted at the supplied node. The result is an ASCII string representation of the tree rooted at the supplied node.
      Parameters:
      node - to begin traversing
      s - the string representation of the current chain of equal kid nodes
      depth - depth of the current chain of nodes
      fullPath - specifies whether each line should show the full path from root or only the suffix
      buffer - the buffer to which the output is printed
    • getNodeStats

      private Map<Integer,Integer> getNodeStats(TernaryNode node, int depth, Map<Integer,Integer> histogram)
      Returns a histogram of how many words end at each depth.
      Parameters:
      node - the node at the root of the tree to count
      depth - the depth of the node from root
      histogram - the depth count histogram to update
      Returns:
      a histogram of how many words end at each depth
    • getNodeStats

      protected Map<Integer,Integer> getNodeStats()
      Returns a histogram of how many words end at each depth.
      Returns:
      a histogram of how many words end at each depth