lexnlp.nlp.en.transforms package

Submodules

lexnlp.nlp.en.transforms.characters module

Transforms related to characters for English

lexnlp.nlp.en.transforms.characters.get_character_distribution(text, lowercase=False, stopword=False)

Get the character distribution of text, optionally lowercasing and removing stopwords first. N.B. This method does not include or count whitespace.

Parameters:
  • text
  • lowercase
  • stopword
Returns:
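
The counting behaviour described above can be sketched with the standard library alone. This is an illustration, not LexNLP's implementation: the helper name `char_distribution_sketch` is hypothetical, stopword removal is left out, and the real function's tokenization and return type may differ.

```python
from collections import Counter


def char_distribution_sketch(text, lowercase=False):
    """Count non-whitespace characters (hypothetical helper, not LexNLP code)."""
    if lowercase:
        text = text.lower()
    # Skip whitespace so that only visible characters are counted.
    return dict(Counter(ch for ch in text if not ch.isspace()))


if __name__ == "__main__":
    print(char_distribution_sketch("Aa a", lowercase=True))
```

With `lowercase=True`, "A" and "a" fall into the same bucket; without it they are counted separately.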

lexnlp.nlp.en.transforms.characters.get_character_ngram_distribution(text, n, lowercase=False, stopword=False)

Get the character n-gram distribution of text, optionally lowercasing and removing stopwords first. N.B. This method does not include or count whitespace.

Parameters:
  • text
  • n
  • lowercase
  • stopword
Returns:
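
A minimal standard-library sketch of character n-gram counting follows. It assumes that whitespace is removed before the n-grams are formed, so an n-gram may span a word boundary; whether LexNLP does the same is not stated here. The helper name `char_ngram_distribution_sketch` is hypothetical and stopword removal is omitted.

```python
from collections import Counter


def char_ngram_distribution_sketch(text, n, lowercase=False):
    """Count character n-grams over non-whitespace characters (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if lowercase:
        text = text.lower()
    # Assumption: whitespace is dropped first, so n-grams can cross word boundaries.
    chars = [ch for ch in text if not ch.isspace()]
    # Slide a window of width n over the remaining characters.
    grams = ("".join(chars[i:i + n]) for i in range(len(chars) - n + 1))
    return dict(Counter(grams))


if __name__ == "__main__":
    print(char_ngram_distribution_sketch("ab ab", 2))
```

If the text has fewer than `n` non-whitespace characters, the sketch returns an empty dictionary.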

lexnlp.nlp.en.transforms.tokens module

Transforms related to tokens for English

lexnlp.nlp.en.transforms.tokens.get_bigram_distribution(text: str, lowercase=False, stopword=False) → Dict[str, int]

Get bigram distribution from text.

Parameters:
  • text
  • lowercase
  • stopword
Returns:

lexnlp.nlp.en.transforms.tokens.get_ngram_distribution(text: str, n: int, lowercase=False, stopword=False) → Dict[str, int]

Get the token n-gram distribution of text, optionally lowercasing and removing stopwords first.
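
To make the idea of a token n-gram distribution concrete, here is a small standard-library sketch. It makes several assumptions that may not match LexNLP: tokens are produced by plain whitespace splitting, each n-gram is stored as a space-joined string key, and stopword removal is omitted. The helper name `ngram_distribution_sketch` is hypothetical.

```python
from collections import Counter


def ngram_distribution_sketch(text, n, lowercase=False):
    """Count token n-grams using whitespace tokenization (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if lowercase:
        text = text.lower()
    # Assumption: whitespace splitting stands in for a real tokenizer.
    tokens = text.split()
    # Each n-gram is n consecutive tokens joined into a single string key.
    grams = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return dict(Counter(grams))


if __name__ == "__main__":
    print(ngram_distribution_sketch("to be or not to be", 2))
```

Under these assumptions, bigrams and trigrams are simply the cases `n=2` and `n=3`.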

lexnlp.nlp.en.transforms.tokens.get_skipgram_distribution(text: str, n: int, k: int, lowercase=False, stopword=False) → Dict[str, int]

Get skipgram distribution from text.

Parameters:
  • text
  • n
  • k
  • lowercase
  • stopword
Returns:
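
The sketch below illustrates one common reading of skipgrams, assuming `n` is the gram length and `k` is the maximum total number of tokens that may be skipped within a gram (the convention used by NLTK's `skipgrams` helper). Whether LexNLP interprets `n` and `k` this way is an assumption. Tokenization is plain whitespace splitting, keys are tuples, stopword removal is omitted, and the name `skipgram_distribution_sketch` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def skipgram_distribution_sketch(text, n, k, lowercase=False):
    """Count k-skip-n-grams over whitespace tokens (hypothetical helper)."""
    if n < 1 or k < 0:
        raise ValueError("n must be >= 1 and k must be >= 0")
    if lowercase:
        text = text.lower()
    tokens = text.split()
    counts = Counter()
    for i, head in enumerate(tokens):
        # The n - 1 remaining tokens are picked, in order, from the next
        # n - 1 + k positions, which allows at most k skipped tokens.
        window = tokens[i + 1:i + n + k]
        for tail in combinations(window, n - 1):
            counts[(head,) + tail] += 1
    return dict(counts)


if __name__ == "__main__":
    print(skipgram_distribution_sketch("a b c d", n=2, k=1))
```

With `k=0` the sketch reduces to ordinary contiguous n-grams.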

lexnlp.nlp.en.transforms.tokens.get_stem_distribution(text: str, lowercase=False, stopword=False) → Dict[str, int]

Get the stemmed token distribution of text, optionally lowercasing and removing stopwords first.

lexnlp.nlp.en.transforms.tokens.get_token_distribution(text: str, lowercase=False, stopword=False) → Dict[str, int]

Get the token distribution of text, optionally lowercasing and removing stopwords first.
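
The effect of the `lowercase` and `stopword` flags on a token distribution can be sketched as follows. This is not LexNLP's implementation: whitespace splitting stands in for the real tokenizer, `STOPWORDS_SKETCH` is a tiny illustrative list rather than the stopword list the library uses, and the helper name `token_distribution_sketch` is hypothetical.

```python
from collections import Counter

# Illustrative stopword list only; the library's actual list is not shown here.
STOPWORDS_SKETCH = {"the", "a", "an", "of", "and", "to"}


def token_distribution_sketch(text, lowercase=False, stopword=False):
    """Count tokens, optionally lowercasing and dropping stopwords (hypothetical helper)."""
    # Assumption: whitespace splitting stands in for a real tokenizer.
    tokens = text.split()
    if lowercase:
        tokens = [t.lower() for t in tokens]
    if stopword:
        # Compare case-insensitively so "The" is filtered even without lowercasing.
        tokens = [t for t in tokens if t.lower() not in STOPWORDS_SKETCH]
    return dict(Counter(tokens))


if __name__ == "__main__":
    print(token_distribution_sketch("The cat and the hat", lowercase=True, stopword=True))
```

With both flags left at `False`, "The" and "the" are counted as distinct tokens and no words are removed.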

lexnlp.nlp.en.transforms.tokens.get_trigram_distribution(text: str, lowercase=False, stopword=False) → Dict[str, int]

Get trigram distribution from text.

Parameters:
  • text
  • lowercase
  • stopword
Returns:

Module contents