lexnlp.extract.common.definitions package
Submodules
lexnlp.extract.common.definitions.common_definition_patterns module
-
class lexnlp.extract.common.definitions.common_definition_patterns.CommonDefinitionPatterns
Bases: object
-
static collect_regex_matches(phrase: str, reg: regex, prob: int, def_start: Callable[[str, Match[~AnyStr]], int], def_end: Callable[[str, Match[~AnyStr]], int]) → List[lexnlp.extract.common.pattern_found.PatternFound]
Find all matches of the pattern ‘reg’ in the phrase.
Parameters:
- def_start – (phrase, match) -> definition’s start
- def_end – (phrase, match) -> definition’s end
Returns: a list of PatternFound objects
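The callback-based signature above can be illustrated with a minimal sketch. It uses the standard library `re` module instead of the third-party `regex` package, and `FoundSketch` is a hypothetical stand-in for PatternFound whose field names are assumptions, not the library's actual attributes.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FoundSketch:
    """Hypothetical stand-in for PatternFound; field names are assumed."""
    name: str
    start: int
    end: int
    probability: int


def collect_matches_sketch(
    phrase: str,
    reg: re.Pattern,
    prob: int,
    def_start: Callable[[str, re.Match], int],
    def_end: Callable[[str, re.Match], int],
) -> List[FoundSketch]:
    """Run `reg` over the phrase and let the callbacks pick each definition's span."""
    found = []
    for match in reg.finditer(phrase):
        start = def_start(phrase, match)
        end = def_end(phrase, match)
        found.append(FoundSketch(phrase[start:end], start, end, prob))
    return found


phrase = "Canal del Futbol (CDF) and Banco Central (BC)."
# Simplified ASCII-only acronym pattern, used here only for illustration.
acronym = re.compile(r"\(([A-Z]+)\)")
results = collect_matches_sketch(
    phrase,
    acronym,
    100,
    def_start=lambda p, m: m.start(1),  # skip the opening parenthesis
    def_end=lambda p, m: m.end(1),      # stop before the closing parenthesis
)
print([r.name for r in results])
```

Passing the span logic in as callbacks keeps the matching loop generic: the same loop can serve patterns whose definition text is the whole match or only a sub-group.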
-
static collect_regex_matches_with_quoted_chunks(phrase: str, reg: regex, prob: int, quoted_def_start: Callable[[str, Match[~AnyStr], Match[~AnyStr]], int], quoted_def_end: Callable[[str, Match[~AnyStr], Match[~AnyStr]], int], def_start: Callable[[str, Match[~AnyStr]], int], def_end: Callable[[str, Match[~AnyStr]], int]) → List[lexnlp.extract.common.pattern_found.PatternFound]
First, find all matches of the pattern ‘reg’. Second, go through the matches: for each match, try to find a set of quoted words. If found, use them as matches; otherwise use the whole match.
Parameters:
- quoted_def_start – (phrase, match, quoted_match) -> definition’s start
- quoted_def_end – (phrase, match, quoted_match) -> definition’s end
- def_start – (phrase, match) -> definition’s start
- def_end – (phrase, match) -> definition’s end
Returns: a list of PatternFound objects
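The two-step procedure described above (match first, then prefer quoted chunks inside each match) might look roughly like the following. This is a sketch under stated assumptions: it uses the standard library `re`, a simplified `QUOTED` pattern rather than the library's reg_quoted, and plain `(name, probability)` tuples instead of PatternFound.

```python
import re
from typing import Callable, List, Tuple

# Simplified stand-in for reg_quoted: straight double quotes only.
QUOTED = re.compile(r'"[^"]*"')


def collect_with_quoted_chunks_sketch(
    phrase: str,
    reg: re.Pattern,
    prob: int,
    quoted_def_start: Callable[[str, re.Match, re.Match], int],
    quoted_def_end: Callable[[str, re.Match, re.Match], int],
    def_start: Callable[[str, re.Match], int],
    def_end: Callable[[str, re.Match], int],
) -> List[Tuple[str, int]]:
    found = []
    for match in reg.finditer(phrase):
        # Look for quoted chunks only inside the span of the outer match.
        quoted = list(QUOTED.finditer(phrase, match.start(), match.end()))
        if quoted:
            for q in quoted:
                s = quoted_def_start(phrase, match, q)
                e = quoted_def_end(phrase, match, q)
                found.append((phrase[s:e], prob))
        else:
            # No quotes: fall back to the span chosen for the whole match.
            s, e = def_start(phrase, match), def_end(phrase, match)
            found.append((phrase[s:e], prob))
    return found


phrase = 'the "Buyer" and the "Seller" means the parties; Lender means the bank.'
# Hypothetical trigger pattern: some text followed by the word "means".
reg = re.compile(r"\b([^;]+?) means\b")
definitions = collect_with_quoted_chunks_sketch(
    phrase,
    reg,
    100,
    quoted_def_start=lambda p, m, q: q.start() + 1,  # drop opening quote
    quoted_def_end=lambda p, m, q: q.end() - 1,      # drop closing quote
    def_start=lambda p, m: m.start(1),
    def_end=lambda p, m: m.end(1),
)
print(definitions)
```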
-
static get_acronym_words_start(phrase: str, match: Match[~AnyStr]) → int
Each acronym match should be preceded by capitalized words that start with the same letters.
Parameters:
- phrase – “rompió el silencio tras ser despedido del Canal del Fútbol (CDF). ”
- match – the Match object for “(CDF)” in this example
Returns: the index of the first letter of the preceding words (42 in this example), or -1
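To make the index in the example concrete, here is a small sketch of one way to locate the words that spell out an acronym. It is not the library's implementation: it compares initials case-insensitively (so that "del" can stand for "D"), takes exactly one preceding word per acronym letter, and uses the standard library `re` with an ASCII-only acronym pattern. The function name is hypothetical.

```python
import re


def acronym_words_start_sketch(phrase: str, match: re.Match) -> int:
    """Return the start index of the words spelling out the acronym, or -1."""
    letters = [ch.lower() for ch in match.group(0) if ch.isalpha()]
    # Words (with their positions) that appear before the acronym.
    words = list(re.finditer(r"\w+", phrase[:match.start()]))
    if not letters or len(words) < len(letters):
        return -1
    candidates = words[-len(letters):]
    initials = [w.group(0)[0].lower() for w in candidates]
    return candidates[0].start() if initials == letters else -1


phrase = "rompió el silencio tras ser despedido del Canal del Fútbol (CDF). "
match = re.search(r"\([A-Z]+\)", phrase)
start = acronym_words_start_sketch(phrase, match)
print(start, repr(phrase[start:match.start()].strip()))

# A phrase whose preceding words do not spell the acronym yields -1.
other = "despedido del Canal (CDF)"
missing = acronym_words_start_sketch(other, re.search(r"\([A-Z]+\)", other))
```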
-
static match_acronyms(phrase: str) → List[lexnlp.extract.common.pattern_found.PatternFound]
Parameters:
- phrase – rompió el silencio tras ser despedido del Canal del Fútbol (CDF).
Returns: {name: ‘CDF’, probability: 100, …}
-
static match_es_def_by_semicolon(phrase: str) → List[lexnlp.extract.common.pattern_found.PatternFound]
Parameters:
- phrase – “Modern anatomy human”: a human of modern anatomy.
Returns: {name: ‘Modern anatomy human’, probability: 100, …}
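The example above pairs a quoted term with a following colon, which is the shape the reg_semicolon pattern listed further down this page describes. The sketch below compiles that same pattern with the standard library `re` (the library itself uses the third-party `regex` package) and strips the quotes to get the name; whether match_es_def_by_semicolon works exactly this way is an assumption.

```python
import re

# Same pattern text as reg_semicolon, compiled with stdlib `re` for illustration.
REG_SEMICOLON = re.compile(r'''(["'“„])(?:(?=(\\?))\2.)*?\1(?=:)''', re.I)

phrase = '"Modern anatomy human": a human of modern anatomy.'
# Drop the surrounding quote characters to get the defined term.
names = [m.group(0)[1:-1] for m in REG_SEMICOLON.finditer(phrase)]
print(names)

# A quoted term that is not followed by a colon is not matched.
no_colon = '"Buyer" means the purchaser.'
other_names = [m.group(0)[1:-1] for m in REG_SEMICOLON.finditer(no_colon)]
```

Because of the `\1` backreference, the closing quote must be the same character as the opening one, so the sketch uses straight double quotes on both sides.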
-
static peek_quoted_part(phrase: str, match: Match[~AnyStr], start_func: Callable[[str, Match[~AnyStr], Match[~AnyStr]], int], end_func: Callable[[str, Match[~AnyStr], Match[~AnyStr]], int], match_prob: int) → List[lexnlp.extract.common.pattern_found.PatternFound]
Parameters:
- phrase – the whole text, may be used for getting the definition’s text length
- match – the matched part of the phrase that may contain several quote-packed definitions
- start_func – (phrase, match, quoted_match) -> definition’s start
- end_func – (phrase, match, quoted_match) -> definition’s end
- match_prob – definition’s probability
Returns: a list of definitions found or an empty list
-
reg_acronyms = regex.Regex('\\(\\p{Lu}\\p{L}*\\p{Lu}\\)', flags=regex.V0)
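The reg_acronyms pattern reads as: an opening parenthesis, an uppercase letter, any number of letters, another uppercase letter, and a closing parenthesis. The `\p{Lu}` and `\p{L}` classes are not available in the standard library `re`, so the sketch below approximates them with `str.isupper()` and `str.isalpha()`. It is an illustration of what the pattern accepts, not a drop-in replacement, and the helper names are hypothetical.

```python
import re

# Anything in parentheses without spaces or nested parentheses.
PARENTHESIZED = re.compile(r"\(([^()\s]+)\)")


def looks_like_acronym(token: str) -> bool:
    """Approximate \\p{Lu}\\p{L}*\\p{Lu}: letters only, upper at both ends."""
    return (
        len(token) >= 2
        and token.isalpha()
        and token[0].isupper()
        and token[-1].isupper()
    )


def find_acronyms_sketch(text: str) -> list:
    return [
        m.group(0)
        for m in PARENTHESIZED.finditer(text)
        if looks_like_acronym(m.group(1))
    ]


text = "Canal (CDF), an index (IPoS), a note (see), one letter (A), mixed (Ab)."
acronyms = find_acronyms_sketch(text)
print(acronyms)
```

Note that the pattern needs at least two letters and allows lowercase letters in the middle, so "(IPoS)" qualifies while "(A)" and "(Ab)" do not.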
-
reg_quoted = regex.Regex('(["\'“„])(?:(?=(\\\\?))\\2.)*?\\1', flags=regex.I | regex.V0)
-
reg_semicolon = regex.Regex('(["\'“„])(?:(?=(\\\\?))\\2.)*?\\1(?=:)', flags=regex.I | regex.V0)
lexnlp.extract.common.definitions.definition_match module
-
class lexnlp.extract.common.definitions.definition_match.DefinitionMatch
Bases: object
Used inside EsDefinitionsParser and SpanishParsingMethods to store intermediate parsing results.
lexnlp.extract.common.definitions.universal_definition_parser module
-
class lexnlp.extract.common.definitions.universal_definition_parser.UniversalDefinitionsParser(parsing_functions: List[Callable[str, List[lexnlp.extract.common.pattern_found.PatternFound]]], split_params: lexnlp.utils.lines_processing.line_processor.LineSplitParams)
Bases: lexnlp.extract.common.text_pattern_collector.TextPatternCollector
Searches for definitions in text using the supplied parsing functions (the original docstring names EsDefinitionsParser and the rules of Spanish). See the “parse” method.
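The constructor signature suggests a parser built from a list of functions, each mapping a phrase to a list of found patterns. How the class actually splits text and merges results is not documented here, so the following is only a sketch under assumptions: `parse_sketch`, the sentence-splitting rule, the two example functions, and the `(name, probability)` tuples are all hypothetical.

```python
import re
from typing import Callable, List, Tuple

Found = Tuple[str, int]  # hypothetical (name, probability) pair


def by_acronym(phrase: str) -> List[Found]:
    # ASCII-only simplification of an acronym rule.
    return [(m.group(1), 100) for m in re.finditer(r"\(([A-Z]{2,})\)", phrase)]


def by_quoted_colon(phrase: str) -> List[Found]:
    # A quoted term directly followed by a colon.
    return [(m.group(1), 100) for m in re.finditer(r'"([^"]+)"(?=:)', phrase)]


def parse_sketch(
    text: str, parsing_functions: List[Callable[[str], List[Found]]]
) -> List[Found]:
    """Split text into phrases and apply every parsing function to each one."""
    results: List[Found] = []
    # Assumed splitting rule: break after '.' or ';' followed by whitespace.
    for phrase in re.split(r"(?<=[.;])\s+", text):
        for func in parsing_functions:
            results.extend(func(phrase))
    return results


text = 'He left Canal del Futbol (CDF). "Modern human": a human of modern anatomy.'
found = parse_sketch(text, [by_acronym, by_quoted_colon])
print(found)
```

Keeping each rule as an independent function means a language-specific parser can be assembled by choosing which functions to pass in.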
-
get_definition_dictionaries()
-
make_annotation_from_pattrn(locale: str, ptrn: lexnlp.extract.common.pattern_found.PatternFound, phrase: lexnlp.utils.lines_processing.line_processor.LineOrPhrase) → lexnlp.extract.common.annotations.text_annotation.TextAnnotation