Class NLGUtils

java.lang.Object
com.articulate.sigma.nlg.NLGUtils
All Implemented Interfaces:
Serializable

public class NLGUtils extends Object implements Serializable
Utilities and variables used by LanguageFormatter and other NLG classes.
  • Field Details

    • outputMap

      public static Map<String,edu.stanford.nlp.ling.CoreLabel> outputMap
    • debug

      public static boolean debug
  • Constructor Details

    • NLGUtils

      public NLGUtils()
  • Method Details

    • init

      public static void init(String kbDir)
    • encoder

      public static void encoder(Object object)
    • decoder

      public static <T> T decoder()
    • serializedExists

      public static boolean serializedExists()
      Check whether a serialized version exists.
    • serializedOld

      public static boolean serializedOld()
      Check whether sources are newer than serialized version.
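      A freshness check of this kind usually compares file modification times. The following is a minimal sketch of that idea, not the NLGUtils implementation: the class name StalenessSketch, the method isStale, and the choice to treat a missing serialized file as stale are all assumptions made for illustration.

```java
import java.io.File;
import java.util.List;

// Hypothetical helper, not part of com.articulate.sigma.nlg.
final class StalenessSketch {

    /**
     * Returns true if the serialized timestamp is 0 (File.lastModified()
     * reports 0 for a missing file) or if any source timestamp is newer.
     */
    static boolean isStale(long serializedMillis, long... sourceMillis) {
        if (serializedMillis == 0L) {
            return true; // assumption: no serialized file counts as stale
        }
        for (long t : sourceMillis) {
            if (t > serializedMillis) {
                return true;
            }
        }
        return false;
    }

    /** Convenience overload that reads the timestamps from the file system. */
    static boolean isStale(File serialized, List<File> sources) {
        long[] times = sources.stream().mapToLong(File::lastModified).toArray();
        return isStale(serialized.lastModified(), times);
    }
}
```

      A caller would typically rebuild and re-serialize when isStale returns true, and load the serialized copy otherwise.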
    • loadSerialized

      public static void loadSerialized()
      Load the most recently saved serialized version.
    • serialize

      @Deprecated(since="OCT 2024", forRemoval=true) public static void serialize()
      Deprecated, for removal: This API element is subject to removal in a future version.
      As of OCT 2024, use encoder(java.lang.Object) instead.
      Save a serialized version.
    • resolveFormatSpecifiers

      public static String resolveFormatSpecifiers(String template, String href)
      Resolve the "format specifiers" in the given printf-style statement.
      Parameters:
      template -
      href -
      Returns:
    • formatList

      public static String formatList(String strseq, String language)
      Format a list of variables which are not enclosed by parens. Formatting includes inserting the appropriate separator between the elements (usually a comma), as well as inserting the conjunction ("and" or its equivalent in another language) if the conjunction doesn't already exist.
      Parameters:
      strseq - the list of variables
      language - the target language (used for the conjunction "and")
      Returns:
      the formatted string
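      The behavior described for formatList can be sketched as follows. This is an illustration under stated assumptions, not the library code: it assumes the input is whitespace-delimited, and it takes the conjunction word directly instead of looking it up from the language argument. The class name FormatListSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of com.articulate.sigma.nlg.
final class FormatListSketch {

    /**
     * Joins whitespace-delimited items with ", " and places the conjunction
     * before the last item. An existing conjunction token is dropped first,
     * so it is never emitted twice.
     */
    static String formatList(String strseq, String conjunction) {
        List<String> items = new ArrayList<>();
        for (String tok : strseq.trim().split("\\s+")) {
            String item = tok.replaceAll(",+$", ""); // strip trailing commas
            if (!item.isEmpty() && !item.equals(conjunction)) {
                items.add(item);
            }
        }
        if (items.isEmpty()) {
            return "";
        }
        if (items.size() == 1) {
            return items.get(0);
        }
        String head = String.join(", ", items.subList(0, items.size() - 1));
        return head + " " + conjunction + " " + items.get(items.size() - 1);
    }
}
```

      Under these assumptions, formatList("?A ?B ?C", "and") yields "?A, ?B and ?C", and an input that already contains the conjunction, such as "?A and ?B", is returned unchanged.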
    • readKeywordMap

      public static void readKeywordMap(String dir)
      Read a set of standard words and phrases in several languages. Each phrase must appear on a new line, with its alternatives in the other languages separated by '|'. The first entry should be a set of two-letter language identifiers. Creates a HashMap of HashMaps where the outer HashMap is keyed by the English phrase and the inner HashMap is keyed by the two-letter language identifier.
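      To make the file layout and the resulting nested map concrete, here is a small parsing sketch. It is not the readKeywordMap implementation: it works on an in-memory list of lines instead of a directory, it assumes English is the first column, and the class name KeywordMapSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of com.articulate.sigma.nlg.
final class KeywordMapSketch {

    /**
     * Parses '|'-separated lines. The first non-blank line lists the language
     * identifiers; each later line lists one phrase per language, in the same
     * column order. Assumption: the first column holds the English phrase.
     */
    static Map<String, Map<String, String>> parse(List<String> lines) {
        Map<String, Map<String, String>> result = new HashMap<>();
        String[] langs = null;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = line.split("\\|", -1);
            if (langs == null) {
                langs = cells; // header row of language identifiers
                continue;
            }
            Map<String, String> byLanguage = new HashMap<>();
            for (int i = 0; i < cells.length && i < langs.length; i++) {
                byLanguage.put(langs[i].trim(), cells[i].trim());
            }
            result.put(cells[0].trim(), byLanguage);
        }
        return result;
    }
}
```

      With the lines "en|de|fr" and "and|und|et", the lookup parse(lines).get("and").get("de") returns "und", which mirrors the two-argument shape of getKeyword(englishWord, language).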
    • getArticle

      public static String getArticle(String s, int count, int occurrence, String language)
      Generate a linguistic article appropriate to how many times in a paraphrase a particular type has already occurred.
      Parameters:
      count - the number of times a given variable has appeared
      occurrence - the number of times variables of a given type have appeared
    • getKeywordMap

      public static Map<String,Map<String,String>> getKeywordMap()
    • setKeywordMap

      public static void setKeywordMap(Map<String,Map<String,String>> themap)
    • getKeyword

      public static String getKeyword(String englishWord, String language)
    • htmlParaphrase

      public static String htmlParaphrase(String href, String stmt, Map<String,String> phraseMap, Map<String,String> termMap, KB kb, String language)
      Hyperlink terms in a natural language format string. This assumes that terms to be hyperlinked are in the form &%termName$termString, where termName is the name of the term to be browsed in the knowledge base and termString is the text that should be displayed hyperlinked.
      Parameters:
      href - the anchor string up to the term= parameter, which this method will fill in.
      stmt - the KIF statement that will be passed to paraphraseStatement for formatting.
      phraseMap - the set of NL formatting statements that will be passed to paraphraseStatement.
      termMap - the set of NL statements for terms that will be passed to paraphraseStatement.
      language - the natural language in which the paraphrase should be generated.
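      The hyperlinking step alone, without the paraphrasing that precedes it, can be sketched with a regular expression. This is an illustration, not the htmlParaphrase implementation. It rests on two assumptions that the description above does not confirm: that termString is enclosed in double quotes (for example &%Human$"human"), and that the method appends "&term=" plus the term name to href. The class name HyperlinkSketch is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of com.articulate.sigma.nlg.
final class HyperlinkSketch {

    // Assumed markup: &%termName$"termString"
    private static final Pattern TERM =
            Pattern.compile("&%([^$\\s]+)\\$\"([^\"]*)\"");

    /** Replaces each marked-up term with an HTML anchor built from href. */
    static String linkTerms(String href, String text) {
        Matcher m = TERM.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String anchor = "<a href=\"" + href + "&term=" + m.group(1) + "\">"
                    + m.group(2) + "</a>";
            // quoteReplacement keeps '$' and '\' in the anchor literal
            m.appendReplacement(out, Matcher.quoteReplacement(anchor));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

      Text outside the marked-up terms passes through unchanged, so a paraphrase with no &% markers is returned as-is.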
    • expandStar

      public static String expandStar(Formula f, String strFormat, String lang)
      This method expands all "star" (asterisk) directives in the input format string, and returns a new format string with individually numbered argument pointers.
      Parameters:
      f - The Formula being paraphrased.
      strFormat - The format string that contains the patterns and directives for paraphrasing f.
      lang - A two-character string indicating the language into which f should be paraphrased.
      Returns:
      A format string with all relevant argument pointers expanded.
    • upcaseFirstVisibleChar

      public static String upcaseFirstVisibleChar(String htmlParaphrase, boolean addFullStop, String language)
      Capitalizes the first visible char of htmlParaphrase, if possible, and adds the full stop symbol for language at a workable place near the end of htmlParaphrase if addFullStop is true.
      Parameters:
      htmlParaphrase - Any String, but assumed to be a Formula paraphrase with HTML markup
      addFullStop - If true, this method will try to add a full stop symbol to the result String.
      language - The language of the paraphrase String.
      Returns:
      String
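      "First visible char" means the first character a reader would see, so HTML tags have to be skipped. The sketch below shows one way to do that. It is not the library implementation: it always uses "." as the full stop and appends it at the very end of the string, whereas the real method selects the symbol by language and looks for a workable position. The class name UpcaseSketch is hypothetical.

```java
// Hypothetical helper, not part of com.articulate.sigma.nlg.
final class UpcaseSketch {

    /**
     * Upper-cases the first non-whitespace character that lies outside an
     * HTML tag, and optionally appends "." if the text does not end with one.
     */
    static String upcaseFirstVisible(String html, boolean addFullStop) {
        StringBuilder sb = new StringBuilder(html);
        boolean inTag = false;
        for (int i = 0; i < sb.length(); i++) {
            char c = sb.charAt(i);
            if (c == '<') {
                inTag = true;
            } else if (c == '>') {
                inTag = false;
            } else if (!inTag && !Character.isWhitespace(c)) {
                sb.setCharAt(i, Character.toUpperCase(c));
                break; // only the first visible character changes
            }
        }
        if (addFullStop) {
            int end = sb.length();
            while (end > 0 && Character.isWhitespace(sb.charAt(end - 1))) {
                end--; // place the full stop before trailing whitespace
            }
            if (end > 0 && sb.charAt(end - 1) != '.') {
                sb.insert(end, '.'); // assumption: "." for every language
            }
        }
        return sb.toString();
    }
}
```

      For example, upcaseFirstVisible("<b>a human</b> is a mammal", true) returns "<b>A human</b> is a mammal.", leaving the markup itself untouched.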
    • containsProcess

      public static boolean containsProcess(Collection<String> vals, KB kb)
      Return true if the given list includes "Process", or if one of its elements is a subclass of Process.
    • formatLongUrl

      public static String formatLongUrl(String quotedUrl)
      Insert spaces into long URLs to improve readability in NL output.