lexnlp.extract.en.conditions: Extracting conditional statements

The lexnlp.extract.en.conditions module contains methods that allow for the extraction of conditional statements from text. Statements that are covered by default in this module are:

  • if

  • if not

  • when

  • when not

  • where

  • where not

  • unless and until

  • unless

  • unless not

  • until

  • until not

  • as soon as

  • as soon as not

  • provided that

  • provided that not

  • subject to

  • not subject to

  • upon the occurrence

  • subject to

  • conditioned on

  • conditioned upon

The full list of current unit test cases can be found here: https://github.com/LexPredict/lexpredict-lexnlp/tree/master/test_data/lexnlp/extract/en/tests/test_conditions

Extracting conditions

Example

>>> import lexnlp.extract.en.conditions
>>> text = "This will occur unless something else happens."
>>> print(list(lexnlp.extract.en.conditions.get_conditions(text)))
[('unless and until', 'This will occur', '')]

>>> import lexnlp.extract.en.conditions
>>> text = "Immediately upon the occurrence of a Change in Control of the Company or the Bank, the Employee shall be paid $125,000.00."
>>> print(list(lexnlp.extract.en.conditions.get_conditions(text)))
[('upon the occurrence', 'Immediately', '')]

Customizing conditional statement extraction

Conditional statement extraction can be customized. There are two key module variables that store the default configuration and one function used to create a matching instance:

  • CONDITION_PHRASES: This List stores the “trigger” phrases that are used to identify conditional statements. They are typically conjunctions or conjunction phrases.

  • CONDITION_PATTERN_TEMPLATE: This String stores the regular expression pattern that drives matching in this module.

Note

For more examples and information about conditional statements, see the linguistic resources below:

The default behavior of this module can be customized by overriding the value of RE_CONDITION with a new regular expression created using create_condition_pattern above. The example below demonstrates a simple addition of a new phrase:

>>> # Out of the box behavior
>>> import lexnlp.extract.en.conditions
>>> text = "This will occur predicated upon something else."
>>> print(list(lexnlp.extract.en.conditions.get_conditions(text)))
[]

>>> # Customize the `RE_CONDITION` variable by adding a new phrase
>>> import regex as re
>>> my_condition_phrases = lexnlp.extract.en.conditions.CONDITION_PHRASES
>>> my_condition_phrases.append("predicated upon")
>>> CONDITION_PATTERN = lexnlp.extract.en.conditions.create_condition_pattern(lexnlp.extract.en.conditions.CONDITION_PATTERN_TEMPLATE, my_condition_phrases)
>>> lexnlp.extract.en.conditions.RE_CONDITION = re.compile(CONDITION_PATTERN, re.IGNORECASE | re.UNICODE | re.DOTALL | re.MULTILINE | re.VERBOSE)

>>> # Run the `get_conditions` method again to test
>>> print(list(lexnlp.extract.en.conditions.get_conditions(text)))
[('predicated upon', 'This will occur', '')]