lexnlp.extract.ml.en.definitions package

Submodules

lexnlp.extract.ml.en.definitions.definition_phrase_detector module

class lexnlp.extract.ml.en.definitions.definition_phrase_detector.DefinitionPhraseDetector

Bases: lexnlp.extract.ml.detector.artifact_detector.ArtifactDetector

Searches for the phrase surrounding the term being defined.

Let the phrase be <agrees to serve the Company in such capacity during the term of employment (the “Employment Period”).>

… model_definition will find <term of employment (the “Employment Period”)>
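For orientation, the span the model targets in the example above can be illustrated with a plain regular expression. This is only an illustration of the target span, not the ML model itself; the sentence literal is taken from the example above:

```python
import re

# Illustration only: the statistical model locates the defining phrase;
# this regex merely shows the kind of span it targets in the example above.
sentence = ('agrees to serve the Company in such capacity during the term '
            'of employment (the “Employment Period”).')

# A quoted term introduced in parentheses, e.g. (the “Employment Period”)
match = re.search(r'\(the “([^”]+)”\)', sentence)
print(match.group(1))  # → Employment Period
```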

process_sample(sample_df: pandas.core.frame.DataFrame, build_target_data: bool = False) → Union[numpy.ndarray, Tuple[numpy.ndarray, numpy.ndarray]]
train_and_save(settings: lexnlp.extract.ml.detector.detecting_settings.DetectingSettings, train_file: str, train_size: int = -1, save_path: str = '', compress: bool = False) → None

Create a definition phrase identification model using tokens.
Parameters:
  • settings – Model settings
  • train_file – File to load training samples from
  • train_size – Number of records to use
  • save_path – Output (pickle model) file path
  • compress – Save compressed file

train_and_save_on_dataframe(settings: lexnlp.extract.ml.detector.detecting_settings.DetectingSettings, train_sample_df: pandas.core.frame.DataFrame, save_path: str = '', compress: bool = False) → None

lexnlp.extract.ml.en.definitions.definition_term_detector module

class lexnlp.extract.ml.en.definitions.definition_term_detector.DefinitionTermDetector

Bases: lexnlp.extract.ml.detector.artifact_detector.ArtifactDetector

process_sample(sample_df: pandas.core.frame.DataFrame, build_target_data: bool = False) → Union[numpy.ndarray, Tuple[numpy.ndarray, numpy.ndarray]]
train_and_save(settings: lexnlp.extract.ml.detector.detecting_settings.DetectingSettings, train_file: str, train_size: int = -1, save_path: str = '', compress: bool = False) → None

Create a definition term identification model using tokens.
Parameters:
  • settings – Model settings
  • train_file – File to load training samples from
  • train_size – Number of records to use
  • save_path – Output (pickle model) file path
  • compress – Save compressed file

train_and_save_on_dataframe(settings: lexnlp.extract.ml.detector.detecting_settings.DetectingSettings, train_sample_df: pandas.core.frame.DataFrame, save_path: str = '', compress: bool = False) → None

lexnlp.extract.ml.en.definitions.layered_definition_detector module

class lexnlp.extract.ml.en.definitions.layered_definition_detector.LayeredDefinitionDetector

Bases: object

get_annotations(sentence: str) → List[lexnlp.extract.common.annotations.definition_annotation.DefinitionAnnotation]
static join_adjacent_definitions_labels(labels_definitions, labels_terms, row_text)
load_compressed(file_path: str)

Loads an archive containing two pickled model files (model_definition and model_term)
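The archive layout is internal to LexNLP; as a rough sketch, assuming a single zip file holding two pickled models (the member names below are hypothetical), loading could look like:

```python
import pickle
import zipfile

# Hypothetical sketch: assumes a zip archive containing two pickle members;
# the real member names and layout used by LexNLP may differ.
def load_two_model_archive(file_path: str) -> dict:
    models = {}
    with zipfile.ZipFile(file_path) as archive:
        for name in archive.namelist():
            with archive.open(name) as member:
                models[name] = pickle.load(member)
    return models
```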

train_on_doccano_jsonl(save_file_path: str, exported_doc_path: str, text_column_name: str = 'text', labels_column_name: str = 'labels', label_term: str = 'term', label_definition: str = 'definition')
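A doccano-style JSONL record compatible with the default arguments above (text_column_name='text', labels_column_name='labels', with label_term/label_definition as the label names) might be produced as follows; the [start, end, label] triple shape is an assumption about the export format:

```python
import json

# Sketch of a doccano-style JSONL line, assuming each record carries a
# "text" field and a "labels" field of [start, end, label] triples,
# matching the default column and label names above.
text = 'term of employment (the “Employment Period”).'
term = 'Employment Period'
term_start = text.index(term)  # compute offsets rather than hardcoding them

row = {
    'text': text,
    'labels': [
        [0, len(text), 'definition'],                  # the defining phrase
        [term_start, term_start + len(term), 'term'],  # the defined term
    ],
}

with open('definitions.jsonl', 'w', encoding='utf-8') as f:
    f.write(json.dumps(row, ensure_ascii=False) + '\n')
```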
train_on_formatted_data(definition_frame: pandas.core.frame.DataFrame, term_frame: pandas.core.frame.DataFrame, save_file_path: str)
Parameters:
  • definition_frame – dataframe with rows (row_text, [(start, end), (start, end), …], feature_mask)
  • term_frame – dataframe with rows (row_text, [(start, end), (start, end), …])
  • save_file_path – path to store zipped model files (as one file)
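A minimal sketch of frames in the layout described above. The column names and the per-character feature mask values are assumptions; only the (row_text, list-of-(start, end) spans, feature_mask) structure comes from the parameter docs:

```python
import pandas as pd

# Column names here are illustrative; the documented structure is
# (row_text, [(start, end), ...]) plus a feature_mask for definitions.
text = 'term of employment (the “Employment Period”).'
term_start = text.index('Employment Period')

definition_frame = pd.DataFrame({
    'row_text': [text],
    'spans': [[(0, len(text))]],        # definition phrase spans
    'feature_mask': [[0] * len(text)],  # hypothetical per-character mask
})
term_frame = pd.DataFrame({
    'row_text': [text],
    'spans': [[(term_start, term_start + len('Employment Period'))]],
})
```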

Module contents