Class MultiWords

java.lang.Object
com.articulate.sigma.wordNet.MultiWords
All Implemented Interfaces:
Serializable

public class MultiWords extends Object implements Serializable
  • Field Details

    • multiWordSerialized

      public Map<String,Set<String>> multiWordSerialized
      A multimap from String keys to String values. Each key is the first word (the head word) of a multi-word WordNet "word", such as "table_tennis", in which the component words are separated by underscores. Each value is a complete multi-word. Because the same head word can begin many multi-words, each key maps to a set of them.
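The shape of this map can be illustrated with a minimal sketch. The class name `HeadWordIndexSketch`, the helper `sample()`, and the example entries other than "table_tennis" are hypothetical and are not taken from the Sigma source; sorted collections are used only to make the printed order stable.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical demo class; it is not part of com.articulate.sigma.wordNet.
class HeadWordIndexSketch {

    // Builds a small map with the same shape as multiWordSerialized:
    // head word -> set of whole multi-words that start with it.
    static Map<String, Set<String>> sample() {
        Map<String, Set<String>> index = new TreeMap<>();
        index.computeIfAbsent("table", k -> new TreeSet<>()).add("table_tennis");
        index.computeIfAbsent("table", k -> new TreeSet<>()).add("table_salt");
        index.computeIfAbsent("ice", k -> new TreeSet<>()).add("ice_cream");
        return index;
    }

    public static void main(String[] args) {
        // All multi-words that begin with the head word "table".
        System.out.println(sample().get("table"));
    }
}
```

Keying on the head word means a scanner only has to compare a token against the multi-words that could possibly start at that token.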
    • debug

      public static boolean debug
  • Constructor Details

    • MultiWords

      public MultiWords()
  • Method Details

    • addMultiWord

      public void addMultiWord(String word, char wordDelimit)
      Add a multi-word string to the multiWordSerialized member variable, converting each wordDelimit character to an underscore.
    • addMultiWord

      public void addMultiWord(String word)
      Add a multi-word string to the multiWordSerialized member variable.
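The two addMultiWord overloads might behave roughly as in the sketch below. This is an illustration under stated assumptions, not the Sigma implementation: the class `AddMultiWordSketch` and its `index` field are hypothetical stand-ins, the one-argument overload is assumed to expect underscore-delimited input, and strings without a delimiter are assumed to be ignored.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in; it is not the real MultiWords class.
class AddMultiWordSketch {

    // Same shape as multiWordSerialized: head word -> whole multi-words.
    final Map<String, Set<String>> index = new HashMap<>();

    // Normalizes the delimiter to '_' and files the result under its head word.
    void addMultiWord(String word, char wordDelimit) {
        String normalized = word.replace(wordDelimit, '_');
        int cut = normalized.indexOf('_');
        if (cut <= 0)
            return; // assumption: skip input with no delimiter or an empty head word
        String head = normalized.substring(0, cut);
        index.computeIfAbsent(head, k -> new HashSet<>()).add(normalized);
    }

    // Assumption: the input already uses underscores as its delimiter.
    void addMultiWord(String word) {
        addMultiWord(word, '_');
    }

    public static void main(String[] args) {
        AddMultiWordSketch sketch = new AddMultiWordSketch();
        sketch.addMultiWord("table tennis", ' ');
        sketch.addMultiWord("table_salt");
        System.out.println(sketch.index.get("table").contains("table_tennis"));
    }
}
```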
    • findMultiWord

      public String findMultiWord(List<String> text)
    • findMultiWord

      public int findMultiWord(List<String> text, int startIndex, List<String> synset)
      Find the synset for a multi-word string, if it exists.
      Parameters:
      text - the list of String words to search
      startIndex - the index of the first word in text to look at
      synset - a list that holds exactly one element if a synset is found, and is empty otherwise
      Returns:
      the index in text of the next word to be checked, which may be the same as startIndex if no multi-word was found
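The contract described above (an index in, the next index out, with the synset returned through the list argument) can be sketched as a longest-match scan. Everything here is an assumption made for illustration: the class `FindMultiWordSketch`, the `synsetOf` lookup table, the `add` helper, and the placeholder synset ids are hypothetical, and the sketch omits the root-form and capitalization handling that the real class performs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in; it is not the real MultiWords class.
class FindMultiWordSketch {

    // head word -> whole multi-words (same shape as multiWordSerialized)
    final Map<String, Set<String>> index = new HashMap<>();
    // multi-word -> synset id; this table is an assumption of the sketch
    final Map<String, String> synsetOf = new HashMap<>();

    void add(String multiWord, String synsetId) {
        int cut = multiWord.indexOf('_');
        if (cut <= 0)
            throw new IllegalArgumentException("not a multi-word: " + multiWord);
        index.computeIfAbsent(multiWord.substring(0, cut), k -> new HashSet<>()).add(multiWord);
        synsetOf.put(multiWord, synsetId);
    }

    // Returns the index of the next word to check; fills synset on a match.
    int findMultiWord(List<String> text, int startIndex, List<String> synset) {
        synset.clear();
        if (startIndex >= text.size())
            return startIndex;
        Set<String> candidates = index.get(text.get(startIndex));
        if (candidates == null)
            return startIndex; // no multi-word starts with this head word
        String best = null;
        int bestLen = 0;
        for (String candidate : candidates) {
            String[] parts = candidate.split("_");
            int end = startIndex + parts.length;
            // Keep the longest candidate whose words all appear in order.
            if (parts.length > bestLen && end <= text.size()
                    && Arrays.asList(parts).equals(text.subList(startIndex, end))) {
                best = candidate;
                bestLen = parts.length;
            }
        }
        if (best == null)
            return startIndex;
        synset.add(synsetOf.get(best));
        return startIndex + bestLen;
    }

    public static void main(String[] args) {
        FindMultiWordSketch sketch = new FindMultiWordSketch();
        sketch.add("table_tennis", "synset-1"); // placeholder id, not a WordNet id
        List<String> text = Arrays.asList("play", "table", "tennis", "daily");
        List<String> synset = new ArrayList<>();
        int next = sketch.findMultiWord(text, 1, synset);
        System.out.println(next + " " + synset);
    }
}
```

A caller would typically loop over text, advancing to the returned index on a match and by one word otherwise.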
    • findMultiWord

      public int findMultiWord(String multiWordKey, String nonRoot, List<String> multiWordTail, List<String> synset)
      Parameters:
      nonRoot - the non-root form of the potential multi-word head word. Both the root form and the original form need to be tried, including capitalized and lowercase versions.
    • rootFormOf

      public static String rootFormOf(String word)