PRE_PROCESS_TEXT_REMOVE
lexnlp.nlp.en.segments.sentences.PRE_PROCESS_TEXT_REMOVE = re.compile('(?:^\\s*\\d+\\s*$)|(?:^\\s*\\<PAGE\\>\\s*(\\d+)?\\s*(\\n|$))|(?:^\\s*(^.+)?[Pp][Aa][Gg][Ee]\\s+\\d+\\s+[Oo][Ff]\\s+\\d+(.+)?$\\s*(\\n|$))|(?:^\\s+$)|(?:^\\s*i+\\s*$)', re.MULTILINE)
Compiled regular expression objects
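Read alternative by alternative, the pattern targets five kinds of line: lines containing only digits, `<PAGE>` markers with an optional number, lines containing "page N of M" in any letter case, whitespace-only lines, and lines made up solely of the letter `i` (for example the lowercase Roman numerals `i`, `ii`, `iii`). The sketch below illustrates this on a small sample. It rebuilds the pattern locally from the value shown above rather than importing LexNLP, and the names `PAGE_ARTIFACT_RE` and `strip_page_artifacts` are hypothetical. This entry does not say how LexNLP applies the pattern, so substituting an empty string for each match is an assumption suggested only by the constant's name.

```python
import re

# Local copy of the pattern documented above (hypothetical name; the real
# constant is lexnlp.nlp.en.segments.sentences.PRE_PROCESS_TEXT_REMOVE).
PAGE_ARTIFACT_RE = re.compile(
    '(?:^\\s*\\d+\\s*$)'                      # a line containing only digits
    '|(?:^\\s*\\<PAGE\\>\\s*(\\d+)?\\s*(\\n|$))'  # "<PAGE>" marker, optional number
    '|(?:^\\s*(^.+)?[Pp][Aa][Gg][Ee]\\s+\\d+\\s+[Oo][Ff]\\s+\\d+(.+)?$\\s*(\\n|$))'  # "page N of M"
    '|(?:^\\s+$)'                             # a whitespace-only line
    '|(?:^\\s*i+\\s*$)',                      # a line of only "i" characters
    re.MULTILINE,
)


def strip_page_artifacts(text: str) -> str:
    """Remove every match of the pattern (assumed usage, not confirmed by the docs)."""
    return PAGE_ARTIFACT_RE.sub('', text)


sample = "First sentence.\n12\nSecond sentence.\nPage 3 of 10\nThird sentence."
cleaned = strip_page_artifacts(sample)
print(repr(cleaned))
```

Because the pattern is compiled with `re.MULTILINE`, `^` and `$` anchor at every line boundary, so each alternative is tested against individual lines rather than against the whole input.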