NLP_IE_Pipelines

This is a general-purpose Natural Language Processing (NLP) system comprising a Named Entity Recognition (NER) module and a Relation Extraction (RE) module. The Information Extraction Document (IE) class is the main data structure used throughout training, evaluation, and prediction for both NER and RE.

Main frameworks: PyTorch, Transformers (Hugging Face)

Supported annotation tools: Label Studio, BRAT, MAE

Development pipeline overview

The annotations are first converted to IE objects, which are then loaded by a PyTorch Dataset to create training instances.
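The conversion and loading steps can be pictured with a minimal sketch. The names `IEDocument`, `Entity`, `annotation_to_ie`, and `NERDataset`, as well as the `(start, end, label)` span format, are assumptions made for illustration and are not the repository's actual API. The dataset here is plain Python that follows the map-style protocol (`__len__` and `__getitem__`); a real implementation would likely subclass `torch.utils.data.Dataset` and use a Hugging Face tokenizer instead of whitespace splitting.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Entity:
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str


@dataclass
class IEDocument:
    # Hypothetical stand-in for the repository's IE class.
    doc_id: str
    text: str
    entities: List[Entity] = field(default_factory=list)


def annotation_to_ie(doc_id: str, text: str,
                     spans: List[Tuple[int, int, str]]) -> IEDocument:
    """Convert tool-agnostic (start, end, label) spans into an IEDocument."""
    return IEDocument(doc_id, text, [Entity(s, e, l) for s, e, l in spans])


class NERDataset:
    """Map-style dataset yielding one BIO-tagged instance per document.

    Whitespace tokens stand in for subword tokenization (an assumption).
    """

    def __init__(self, docs: List[IEDocument]):
        self.docs = docs

    def __len__(self) -> int:
        return len(self.docs)

    def __getitem__(self, idx: int) -> Dict[str, object]:
        doc = self.docs[idx]
        tokens, labels = [], []
        pos = 0
        for tok in doc.text.split():
            start = doc.text.index(tok, pos)  # locate token in the raw text
            end = start + len(tok)
            pos = end
            label = "O"
            for ent in doc.entities:
                # A token gets a tag if it lies fully inside an entity span.
                if start >= ent.start and end <= ent.end:
                    prefix = "B-" if start == ent.start else "I-"
                    label = prefix + ent.label
                    break
            tokens.append(tok)
            labels.append(label)
        return {"doc_id": doc.doc_id, "tokens": tokens, "labels": labels}


doc = annotation_to_ie(
    "doc1",
    "Patient took aspirin 81 mg daily",
    [(13, 20, "DRUG"), (21, 26, "DOSE")],
)
dataset = NERDataset([doc])
instance = dataset[0]
```

Keeping the annotation-tool parsing separate from the dataset means each supported tool only needs its own converter into the shared document structure.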

Prediction pipeline overview

The raw text for information extraction is loaded and converted into IE objects. A fine-tuned NER model then makes predictions on these IEs and outputs IEs with entities. An RE model then takes the IEs produced by NER as input and outputs IEs with entities and relations.
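The data flow of the prediction stages can be sketched as follows. Everything named here (`IEDocument`, `text_to_ie`, `ner_step`, `re_step`, the lexicon, and the `has_dose` relation) is hypothetical, and the two "models" are simple rule-based placeholders standing in for the fine-tuned NER and RE models; the sketch only shows how a document accumulates entities and then relations as it moves through the pipeline.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class IEDocument:
    # Hypothetical stand-in for the repository's IE class.
    doc_id: str
    text: str
    # (start offset, end offset, label)
    entities: List[Tuple[int, int, str]] = field(default_factory=list)
    # (head entity index, tail entity index, label)
    relations: List[Tuple[int, int, str]] = field(default_factory=list)


def text_to_ie(doc_id: str, text: str) -> IEDocument:
    """Wrap raw text in an IEDocument with no entities or relations yet."""
    return IEDocument(doc_id, text)


def ner_step(doc: IEDocument, lexicon: Dict[str, str]) -> IEDocument:
    """Placeholder for the NER model: first-match dictionary lookup."""
    for term, label in lexicon.items():
        start = doc.text.find(term)
        if start != -1:
            doc.entities.append((start, start + len(term), label))
    doc.entities.sort()  # order entities by position in the text
    return doc


def re_step(doc: IEDocument, max_gap: int = 5) -> IEDocument:
    """Placeholder for the RE model: link a DRUG to a DOSE that follows
    it within max_gap characters."""
    for i, (_, end1, label1) in enumerate(doc.entities):
        for j, (start2, _, label2) in enumerate(doc.entities):
            if (label1 == "DRUG" and label2 == "DOSE"
                    and 0 <= start2 - end1 <= max_gap):
                doc.relations.append((i, j, "has_dose"))
    return doc


lexicon = {"aspirin": "DRUG", "81 mg": "DOSE"}
doc = text_to_ie("doc1", "Patient took aspirin 81 mg daily")
doc = ner_step(doc, lexicon)  # IE with entities
doc = re_step(doc)            # IE with entities and relations
```

Because each stage takes a document and returns a document, the NER output can be inspected or evaluated on its own before the RE stage runs.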
