lexnlp.extract.es package

Submodules

lexnlp.extract.es.copyrights module

class lexnlp.extract.es.copyrights.CopyrightEsParser

Bases: lexnlp.extract.common.copyrights.copyright_en_style_parser.CopyrightEnStyleParser

classmethod extract_phrases_with_coords(sentence: str) → List[Tuple[str, int, int]]
static init_parser()
line_processor = <lexnlp.utils.lines_processing.line_processor.LineProcessor object>
lexnlp.extract.es.copyrights.get_copyrights(text: str, return_sources=False) → Generator[[dict, None], None]

lexnlp.extract.es.courts module

Court extraction for Spanish.

This module implements extraction functionality for courts in Spain, including formal names, abbreviations, and aliases.

lexnlp.extract.es.courts.get_court_annotations(text: str, language: str = None) → Generator[[dict, None], None]
lexnlp.extract.es.courts.get_courts()

See lexnlp/extract/en/tests/test_courts.py

lexnlp.extract.es.courts.setup_es_parser()

lexnlp.extract.es.dates module

lexnlp.extract.es.definitions module

class lexnlp.extract.es.definitions.SpanishParsingMethods

Bases: object

All methods in this class share the same signature:
def method_name(phrase: str) -> List[PatternFound]:

The methods are used to find definition “candidates”.

static match_es_def_by_hereafter(phrase: str) → List[lexnlp.extract.common.pattern_found.PatternFound]
Parameters: phrase – las instrucciones de uso o instalación del software o todas las descripciones de uso del mismo (de aquí en adelante, la “Documentación”);
Returns: {name: ‘Documentación’, probability: 100, …}
static match_es_def_by_reffered(phrase: str) → List[lexnlp.extract.common.pattern_found.PatternFound]
Parameters: phrase – En este acuerdo, el término “Software” se refiere a: (i) el programa informático que acompaña a este Acuerdo y todos sus componentes;
Returns: definitions (objects)
static match_first_word_is(phrase: str) → List[lexnlp.extract.common.pattern_found.PatternFound]
Parameters: phrase – El tabaquismo es la adicción al tabaco, provocada principalmente.
Returns: definitions (objects)
reg_first_word_is = re.compile('^.+?(?=es\\s+\\w+\\W+\\w+|está\\s+\\w+\\W+\\w+)')
reg_hereafter = re.compile('(?<=(en adelante[,\\s]))[\\w\\s*\\"*]+')
reg_reffered = re.compile('^.+(?=se refiere)')
lexnlp.extract.es.definitions.get_definition_annotations(text: str, language=None) → Generator[[lexnlp.extract.common.annotations.definition_annotation.DefinitionAnnotation, None], None]
lexnlp.extract.es.definitions.get_definition_list(text: str, language=None) → List[lexnlp.extract.common.annotations.definition_annotation.DefinitionAnnotation]
lexnlp.extract.es.definitions.get_definitions(text: str, language=None) → Generator[[dict, None], None]
lexnlp.extract.es.definitions.make_es_definitions_parser()
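
The three compiled patterns listed under SpanishParsingMethods can be tried in isolation with the standard re module. The sketch below copies the patterns from the class attributes above and applies them to the example phrases from the parameter descriptions. It shows only what the raw regular expressions capture, not the library's full matching logic; the helper first_match is hypothetical, and straight double quotes are used in the sample phrases because the reg_hereafter character class lists only the straight quote character.

```python
import re

# Patterns copied from the SpanishParsingMethods class attributes.
reg_first_word_is = re.compile(r'^.+?(?=es\s+\w+\W+\w+|está\s+\w+\W+\w+)')
reg_hereafter = re.compile(r'(?<=(en adelante[,\s]))[\w\s*\"*]+')
reg_reffered = re.compile(r'^.+(?=se refiere)')


def first_match(pattern, phrase):
    """Hypothetical helper: return the first raw match, stripped, or None."""
    match = pattern.search(phrase)
    return match.group(0).strip() if match else None


hereafter = first_match(
    reg_hereafter,
    'las instrucciones de uso del mismo (de aquí en adelante, la "Documentación");',
)
reffered = first_match(
    reg_reffered,
    'En este acuerdo, el término "Software" se refiere a: (i) el programa informático;',
)
first_word = first_match(
    reg_first_word_is,
    'El tabaquismo es la adicción al tabaco, provocada principalmente.',
)

print(hereafter)   # la "Documentación"
print(reffered)    # En este acuerdo, el término "Software"
print(first_word)  # El tabaquismo
```

Each raw match still contains surrounding words such as the article "la" or the lead-in "En este acuerdo, el término", so some further trimming is presumably needed before a match becomes a definition name like the ‘Documentación’ shown in the documented return value.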

lexnlp.extract.es.language_tokens module

class lexnlp.extract.es.language_tokens.EsLanguageTokens

Bases: object

Spanish parts of speech, used in a number of parsing methods

abbreviations = {'abs.', 'act.', 'inc.', 'no.', 'nr.', 'p.'}
articles = ['el', 'la', 'los', 'las']
conjunctions = ['und', 'oder']

lexnlp.extract.es.regulations module

class lexnlp.extract.es.regulations.RegulationsParser(regulations_dataframe: pandas.core.frame.DataFrame = None)

Bases: object

Parses Spanish regulations (acts, institutions and so on):
  • “la emisión de instrumentos inscritos en el Registro Nacional de Valores, colocados” boils down to ‘Registro Nacional de Valores’
  • the parser expects a trigger word such as ‘registro’, ‘comisión’, ‘comision’ or ‘ley del’ at the start of the regulation name
get_annotations_as_dictionaries() → List
load_trigger_words() → None
match_start_trigger(phrase: str) → None
Parameters: phrase – mediante la emisión de instrumentos inscritos en el Registro Nacional de Valores, colocados
Returns: {name: ‘Registro Nacional de Valores’, probability: 100, …}
parse(text: str, locale: str = None) → List[lexnlp.extract.common.annotations.regulation_annotation.RegulationAnnotation]
setup_regexes() → None
trim_annotations() → None
lexnlp.extract.es.regulations.get_regulation_annotations(text: str, language: str = None) → Generator[[lexnlp.extract.common.annotations.regulation_annotation.RegulationAnnotation, None], None]
lexnlp.extract.es.regulations.get_regulation_list(text: str, language: str = None) → List[lexnlp.extract.common.annotations.regulation_annotation.RegulationAnnotation]
lexnlp.extract.es.regulations.get_regulations(text: str, language: str = None) → Generator[[dict, None], None]
lexnlp.extract.es.regulations.make_de_regulations_parser()
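
The trigger-word idea described for RegulationsParser can be illustrated with a small standalone sketch. This is not the library's implementation: the trigger list is limited to the four words quoted in the class description, and the rule for where a name ends (capitalised words joined by short connectors) is an assumption made for the example. The names TRIGGERS, CONNECTORS and find_regulation_names are hypothetical.

```python
import re

# Trigger words quoted in the RegulationsParser description (not a full list).
TRIGGERS = ("registro", "comisión", "comision", "ley del")

# Assumption: a name continues over capitalised words, optionally
# joined by short lowercase connectors such as "de" or "del".
CONNECTORS = ("de", "del", "la", "las", "los", "el", "y")

_PATTERN = re.compile(
    # The trigger is matched case-insensitively via a scoped inline flag...
    r"\b(?i:" + "|".join(re.escape(t) for t in TRIGGERS) + r")"
    # ...while the following words must really start with a capital letter.
    r"(?:\s+(?:(?:" + "|".join(CONNECTORS) + r")\s+)*[A-ZÁÉÍÓÚÑ]\w*)+"
)


def find_regulation_names(text):
    """Return (name, start, end) tuples for trigger-led capitalised phrases."""
    return [(m.group(0), m.start(), m.end()) for m in _PATTERN.finditer(text)]


phrase = ("mediante la emisión de instrumentos inscritos en el "
          "Registro Nacional de Valores, colocados")
names = [name for name, _, _ in find_regulation_names(phrase)]
print(names)  # ['Registro Nacional de Valores']
```

Scoping the case-insensitive flag to the trigger alone keeps the capital-letter test meaningful, so a lowercase phrase such as "el registro de datos" is not reported.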

Module contents