LexNLP is a library for working with real, unstructured legal text, including contracts, plans, policies, procedures, and other material. LexNLP provides functionality such as:
- Segmentation and tokenization, such as
A sentence parser that is aware of common legal abbreviations like LLC. or F.3d.
Pre-trained segmentation models for legal concepts such as pages or sections.
Pre-trained word embedding and topic models, broadly and for specific practice areas
Pre-trained classifiers for document type and clause type
Broad range of fact extraction, such as:
Monetary amounts, non-monetary amounts, percentages, ratios
Conditional statements and constraints, like “less than” or “later than”
Dates, recurring dates, and durations
Courts, regulations, and citations
Tools for building new clustering and classification methods
Hundreds of unit tests from real legal documents
LexNLP is often used as part of ContraxSuite, an open source contract analytics and document exploration platform built by LexPredict. LexNLP and ContraxSuite are related through the following project structure:
ContraxSuite web application: https://github.com/LexPredict/lexpredict-contraxsuite
LexNLP library for extraction: https://github.com/LexPredict/lexpredict-lexnlp
ContraxSuite pre-trained models and “knowledge sets”: https://github.com/LexPredict/lexpredict-legal-dictionary
ContraxSuite agreement samples: https://github.com/LexPredict/lexpredict-contraxsuite-samples
ContraxSuite deployment automation: https://github.com/LexPredict/lexpredict-contraxsuite-deploy
Citing for academic use¶
We are currently drafting and submitting a technical whitepaper describing the LexNLP library and documenting its performance on a large corpus of gold-standard contracts. Please contact us at email@example.com for more information on citing in academic use.