lexnlp.nlp.en.segments.sentences: Segmenting sentences in text
The lexnlp.nlp.en.segments.sentences module contains methods for segmenting text into zero or more sentences.
Attention
The sections below are a work in progress. Thank you for your patience while we continue to expand and improve our documentation coverage.
If you have any questions in the meantime, please feel free to log issues on GitHub at the URL below or contact us at the email below:
- GitHub issues: https://github.com/LexPredict/lexpredict-lexnlp
- Email: support@contraxsuite.com
lexnlp.nlp.en.segments.sentences Module
Sentence segmentation for English.
This module implements sentence segmentation in English using simple machine learning classifiers.
Todo:
- Standardize model (re-)generation
Functions
build_sentence_model(text[, extra_abbrevs]) | Build a sentence model from text with optional extra abbreviations to include.
get_sentence_list(text) | Get sentences from text.
get_sentence_span_list(…) | Given a text, returns a list of the (start, end) spans of sentences in the text.
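The relationship between a sentence list and a list of (start, end) spans can be sketched in plain Python. The function naive_sentence_spans below is a hypothetical stand-in written for illustration only: it splits on terminal punctuation with a regular expression and is not LexNLP's classifier-based segmenter. It assumes, as the table above states for get_sentence_span_list, that each span is a (start, end) pair of character offsets into the original text.

```python
import re
from typing import List, Tuple


def naive_sentence_spans(text: str) -> List[Tuple[int, int]]:
    """Hypothetical stand-in, NOT the LexNLP implementation.

    Returns (start, end) character offsets of sentence-like runs,
    where a run ends at '.', '!' or '?' followed by whitespace or
    the end of the text.
    """
    pattern = re.compile(r"\S.*?(?:[.!?]+(?=\s|$)|$)", flags=re.DOTALL)
    return [(m.start(), m.end()) for m in pattern.finditer(text)]


text = "The lease begins today. Rent is due monthly! Is that clear?"
spans = naive_sentence_spans(text)

# Slicing the original text with each span recovers the sentence itself,
# which is why a span list carries strictly more information than a
# sentence list: offsets can be mapped back onto the source document.
sentences = [text[start:end] for start, end in spans]
for span, sentence in zip(spans, sentences):
    print(span, repr(sentence))
```

Spans are useful when downstream annotations need to point back into the unmodified source text, whereas a plain sentence list is enough when only the sentence strings matter.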
Variables
MODULE_PATH | str(object='') -> str
SENTENCE_SEGMENTER_MODEL | A sentence tokenizer which uses an unsupervised algorithm to build a model for abbreviation words, collocations, and words that start sentences; and then uses that model to find sentence boundaries.
extra_abbreviations | list() -> new empty list
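To see why a list of extra abbreviations matters for sentence segmentation, consider the toy splitter below. It is a minimal sketch under stated assumptions: split_with_abbreviations is a hypothetical helper, it uses a fixed rule rather than the unsupervised model described above, and it does not reproduce how LexNLP consumes extra_abbreviations or the extra_abbrevs parameter.

```python
from typing import Iterable, List


def split_with_abbreviations(text: str, abbreviations: Iterable[str]) -> List[str]:
    """Toy rule-based splitter (hypothetical, NOT the LexNLP model).

    Ends a sentence at a token finishing in '.', '!' or '?', unless the
    token is a known abbreviation followed by a period.
    """
    known = {abbrev.lower().rstrip(".") for abbrev in abbreviations}
    sentences: List[str] = []
    current: List[str] = []
    for token in text.split():
        current.append(token)
        if token[-1] in ".!?":
            # Only a trailing period can mark an abbreviation such as "Sec."
            is_abbrev = token.endswith(".") and token[:-1].lower() in known
            if not is_abbrev:
                sentences.append(" ".join(current))
                current = []
    if current:
        # Keep any trailing text that lacks terminal punctuation
        sentences.append(" ".join(current))
    return sentences


text = "The term is defined in Sec. 4 of the lease. Rent is due monthly."

# Without the abbreviation, "Sec." is wrongly treated as a sentence end.
print(split_with_abbreviations(text, []))

# With it, the first sentence stays intact.
print(split_with_abbreviations(text, ["sec"]))
```

Legal text is dense with abbreviations of this kind, which is one plausible reason for exposing a way to supply additional ones when a sentence model is built.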