Interface Categorizer

  • All Known Implementing Classes:
    CategorizerImpl

    public interface Categorizer
    The Categorizer interface recognizes the category of a sentence using a trained model. The training data comes from the categories_[lang].txt file. This file can be changed: an administrator can download it, edit it, and save it back to the resources folder.
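    The layout of categories_[lang].txt is not described here. The sketch below assumes the line format commonly used with OpenNLP's DocumentSampleStream, that is, one sample per line with the category first, followed by whitespace-separated tokens. TrainingLine is a hypothetical helper for illustration only and is not part of this API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of the documented API): one parsed training line.
final class TrainingLine {
    final String category;
    final List<String> tokens;

    private TrainingLine(String category, List<String> tokens) {
        this.category = category;
        this.tokens = tokens;
    }

    // Assumed format: "<category> <token> <token> ...", separated by whitespace.
    static TrainingLine parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected a category and at least one token: " + line);
        }
        return new TrainingLine(parts[0], Arrays.asList(parts).subList(1, parts.length));
    }
}
```

    Under that assumption, a line such as "greeting hello how are you" would yield the category "greeting" and four tokens.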
    • Method Detail

      • getInstance

        static CategorizerImpl getInstance()
        Returns the single instance of the class. If no instance exists yet, the constructor is called to create one.
        Returns:
        the Categorizer instance.
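        The lazy creation described above can be sketched as follows. This is a minimal illustration and not the actual CategorizerImpl source: the class name LazyCategorizer, the constructions counter, and the synchronized modifier are all assumptions added for the example.

```java
// Hypothetical stand-in for CategorizerImpl, showing lazy singleton creation.
final class LazyCategorizer {
    private static LazyCategorizer instance;

    // Illustration only: counts how often the (expensive) constructor runs.
    static int constructions = 0;

    private LazyCategorizer() {
        constructions++;
    }

    // Creates the instance on first use and returns the same object afterwards.
    // 'synchronized' is an assumption here, added to keep the sketch thread-safe.
    static synchronized LazyCategorizer getInstance() {
        if (instance == null) {
            instance = new LazyCategorizer();
        }
        return instance;
    }
}
```

        Repeated calls return the same object, so any costly setup in the constructor happens only once.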
      • train

        opennlp.tools.doccat.DoccatModel train(String lang)
        This method takes a long time to execute because it prepares the training model for the categorizer from the training data provided in the CategorizerImpl() constructor. For optimization purposes, we have used a singleton.
        Parameters:
        lang - the language in which the model will be trained. It also identifies the file used to train the model.
        Returns:
        the trained DoccatModel instance.
        See Also:
        ObjectStream, DoccatModel, TrainingParameters, DoccatFactory, DocumentSampleStream, DocumentSample, DocumentCategorizerME, MarkableFileInputStreamFactory, PlainTextByLineStream
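        Since lang also selects the training file, the mapping from a language code to a categories_[lang].txt file name can be sketched as below. TrainingFiles and fileNameFor are hypothetical names, and the restriction to two- or three-letter codes is an assumption made to keep unexpected input out of the file name; the real implementation may resolve the file differently.

```java
// Hypothetical helper (not part of the documented API).
final class TrainingFiles {
    private TrainingFiles() {
    }

    // Builds the training file name for a language code, e.g. "en" -> "categories_en.txt".
    // Assumption: language codes are two or three ASCII letters.
    static String fileNameFor(String lang) {
        if (lang == null || !lang.matches("[A-Za-z]{2,3}")) {
            throw new IllegalArgumentException("unsupported language code: " + lang);
        }
        return "categories_" + lang + ".txt";
    }
}
```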