`lexnlp.nlp.en.segments.titles`: Segmenting and identifying titles in text

The `lexnlp.nlp.en.segments.titles` module contains functions for identifying titles in text and for segmenting text that contains zero or more titles.

Attention

The sections below are a work in progress. Thank you for your patience while we continue to expand and improve our documentation coverage.

If you have any questions in the meantime, please feel free to log issues on GitHub at the URL below or contact us at the email below:

lexnlp.nlp.en.segments.titles Module

Title segmentation for English.

This module implements title segmentation/location in English using simple machine learning classifiers.

`build_document_line_distribution`(text[, …])	Build document and line character distribution for section segmenting based on fixed character, optionally normalizing vector.
`build_document_title_features`(text[, …])	Get a document title given file text.
`build_model`(training_file_path)	Build a title extraction model given a training file path.
`build_title_features`(lines, line_id, …[, …])	Build a feature vector for a given line ID with given parameters.
`get_titles`(text[, window_pre, window_post, …])	Get titles from text.
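
The summaries above mention character distributions and `window_pre`/`window_post` parameters without showing what such features might look like. The following is a minimal, standard-library-only sketch of the general idea, not LexNLP's actual implementation: the names `ALPHABET`, `char_distribution`, `window_lines`, and `line_features` are hypothetical, and the character set and window defaults used here are assumptions chosen for illustration.

```python
import string
from typing import List

# Hypothetical fixed alphabet (an assumption for this sketch);
# the library's own character set may differ.
ALPHABET = string.ascii_letters + string.digits + string.punctuation + " "


def char_distribution(text: str, characters: str = ALPHABET,
                      normalize: bool = True) -> List[float]:
    """Count how often each character of `characters` occurs in `text`,
    optionally normalizing the counts so they sum to 1."""
    counts = [float(text.count(c)) for c in characters]
    total = sum(counts)
    if normalize and total > 0:
        counts = [c / total for c in counts]
    return counts


def window_lines(lines: List[str], line_id: int,
                 window_pre: int = 3, window_post: int = 3) -> List[str]:
    """Return the lines surrounding `line_id`, padding with empty
    strings where the window runs past the start or end of the document."""
    window = []
    for i in range(line_id - window_pre, line_id + window_post + 1):
        window.append(lines[i] if 0 <= i < len(lines) else "")
    return window


def line_features(lines: List[str], line_id: int,
                  window_pre: int = 3, window_post: int = 3) -> List[float]:
    """Concatenate the character distributions of every line in the window
    into one flat feature vector for the line at `line_id`."""
    features: List[float] = []
    for line in window_lines(lines, line_id, window_pre, window_post):
        features.extend(char_distribution(line))
    return features


if __name__ == "__main__":
    doc = ["MASTER SERVICES AGREEMENT", "", "This Agreement is made between the parties."]
    vector = line_features(doc, line_id=0, window_pre=1, window_post=1)
    print(len(vector))  # one distribution per line in the window
```

A vector of this kind, built for each line, is the sort of input a line-level classifier could score to decide whether a line is a title. How LexNLP actually constructs its features should be checked against the library source.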

`MODULE_PATH`	str(object='') -> str
`SECTION_SEGMENTER_MODEL`	An extra-trees classifier.