lexnlp.extract.en.courts: Extracting court references

The lexnlp.extract.en.courts module contains methods that allow for the extraction of court or venue references from text.

Attention

The methods in this module rely heavily on data from the LexPredict Legal Dictionary repository: https://github.com/LexPredict/lexpredict-legal-dictionary

This repository includes courts such as:
  • Australian courts
  • Canadian courts
  • German courts
  • US Federal and State courts

This data is governed by a separate Creative Commons Attribution Share Alike 4.0 license here: https://github.com/LexPredict/lexpredict-legal-dictionary/blob/master/LICENSE

The full list of current unit test cases can be found here: https://github.com/LexPredict/lexpredict-lexnlp/tree/master/test_data/lexnlp/extract/en/tests/test_courts

Extracting courts

lexnlp.extract.en.courts.get_courts()

Searches for courts from the provided config list and yields tuples of (court_config, court_alias). Court config is: (court_id, court_name, [list of aliases]) Alias is: (alias_text, language, is_abbrev, alias_id)

This method uses general searching routines for dictionary entities from dict_entities.py module. Methods of dict_entities module can be used for comfortable creating the config: entity_config(), entity_alias(), add_aliases_to_entity(). :param text: :param court_config_list: List list of all possible known courts in the form of tuples:

(id, name, [(alias, lang, is_abbrev], …).
Parameters:
  • return_source
  • priority – If two courts found with the totally equal matching aliases - then use the one with the lowest id.
  • text_languages – Language(s) of the source text. If a language is specified then only aliases of this
language will be searched for. For example: this allows ignoring “Island” - a German language
alias of Iceland for English texts.
Returns:Generates tuples: (court entity, court alias)

Example

>>> # Manually set court configuration data
>>> import lexnlp.extract.en.courts
>>> text = "The case will be heard in E.D. Va. next month"
>>> court_config_data = [entity_config(0, "Eastern District of Virginia", 0, ["E.D. Va."]),
    entity_config(1, "Western District of Virginia", 0, ["W.D. Va."])]
>>> for entity, alias in lexnlp.extract.en.courts.get_courts(text, court_config_data):
    print("entity=", entity)
    print("alias=", alias)
entity= (0, 'Eastern District of Virginia', 0, [('Eastern District of Virginia', None, False, None), ('E.D. Va.', None, False, None)])
alias= ('E.D. Va.', None, False, None)

>>> # Load court configuration data automatically from LexPredict legal dictionaries
>>> import pandas
>>> text = "To be heard in either E.D. Va. or S.D.N.Y."
>>> court_df = pandas.read_csv("https://raw.githubusercontent.com/LexPredict/lexpredict-legal-dictionary/1.0.5/en/legal/us_courts.csv")
>>> # Create config objects
>>> court_config_data = []
>>> for _, row in court_df.iterrows():
    c = entity_config(row["Court ID"], row["Court Name"], 0, row["Alias"].split(";") if not pandas.isnull(row["Alias"]) else [])
    court_config_data.append(c)
>>> for entity, alias in lexnlp.extract.en.courts.get_courts(text, court_config_data):
    print("entity=", entity)
    print("alias=", alias)
entity= (98, 'Eastern District of Virginia', 0, [('Eastern District of Virginia', None, False, None), ('E.D. Va.', None, False, None)])
alias= ('E.D. Va.', None, False, None)
entity= (70, 'Southern District of New York', 0, [('Southern District of New York', None, False, None), ('S.D.N.Y.', None, False, None)])
alias= ('S.D.N.Y.', None, False, None)