General Update For Paragraph Splitting, Metadata Preservation, and MongoDB Prediction by walkernr · Pull Request #9 · lbnlp/MatBERT_NER

walkernr · 2021-10-26T20:02:33Z

These changes implement a more robust paragraph-splitting approach that resolves many edge-case errors. Metadata containing the DOI and paragraph number is now passed through the dataloader and the prediction function. Several errors in the prediction function caused by incompatibility with the model trainer have also been fixed, along with problems with MongoDB prediction.
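
For readers unfamiliar with the problem, the following is a minimal sketch of length-limited paragraph splitting with metadata carried along. It is not the code from this pull request. The function names, the `doi`/`par`/`part`/`tokens` keys, and the rule of cutting after a period with a hard-cut fallback are all assumptions made for illustration.

```python
from typing import Dict, List


def split_paragraph(tokens: List[str], max_len: int) -> List[List[str]]:
    """Split tokens into chunks of at most max_len tokens (hypothetical helper).

    Prefers to cut just after a period inside the current window and
    falls back to a hard cut when the window contains no usable period,
    so a paragraph with no sentence boundary can still be split.
    """
    if max_len < 1:
        raise ValueError("max_len must be positive")
    chunks: List[List[str]] = []
    start = 0
    while len(tokens) - start > max_len:
        window_end = start + max_len
        cut = window_end  # hard-cut fallback
        # Search backwards for the last period in the window.
        for i in range(window_end - 1, start, -1):
            if tokens[i] == ".":
                cut = i + 1
                break
        chunks.append(tokens[start:cut])
        start = cut
    if start < len(tokens):
        chunks.append(tokens[start:])
    return chunks


def split_with_metadata(entry: Dict, max_len: int) -> List[Dict]:
    """Attach the DOI and paragraph number to every chunk (assumed schema)."""
    pieces = split_paragraph(entry["tokens"], max_len)
    return [
        {"doi": entry["doi"], "par": entry["par"], "part": k, "tokens": piece}
        for k, piece in enumerate(pieces)
    ]
```

Because every chunk keeps its `doi` and `par` values, predictions made on the chunks can later be regrouped and written back to the entry they came from.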

walkernr added 30 commits August 24, 2021 13:53

fix issue with custom datafile

16dda9c

debugging mongo predict

f71b6ee

prevent bad partition when splitting paragraphs

7b95412

debug

915704e

debug

e87f9aa

debug

dc2b036

debug

0438383

debug

6e927a3

debug

f36cc69

fix issue with unsplittable paragraphs and prediction function

363075e

switched predict script to gpu

afc2a47

fix in predict script

6655cd5

mem fix

db1b490

pass metadata

d1f79b8

pass labels

dc9405b

predict script update

cb7c5ca

mem check

d1336d5

pass metadata

ba04723

debug metadata

8174d1b

fix merge of original and annotated data

3d21bc6

bugfix

1703692

bugfix

0355d3a

bugfix

6bbf504

newest prediction format

0e12134

changed predictions to update original entries, not cleaned

70b6a45

change pytorch pickle outputs for histories and metrics to JSON, improve compatibility with different input formats for prediction

d08ebf6

np encoder for metrics and support for cased matbert

1e3855c

dictionary support in training script

ede2f5a

full dict return for annotations in train script

53ef066

corrected doping dataset

867e1f0
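
Two of the commits above mention writing histories and metrics as JSON and adding an "np encoder". The sketch below shows one common way to make array-like values JSON-serializable. It is an illustration under assumptions, not the encoder from this pull request. The class name `ArrayJSONEncoder` is hypothetical, and the sketch relies only on the `tolist()` method that NumPy scalars and arrays expose, so it needs no third-party import.

```python
import json


class ArrayJSONEncoder(json.JSONEncoder):
    """JSON encoder that converts array-like values via tolist() (hypothetical)."""

    def default(self, obj):
        # NumPy scalars and arrays expose tolist(), which returns plain
        # Python numbers or nested lists that json can serialize.
        tolist = getattr(obj, "tolist", None)
        if callable(tolist):
            return tolist()
        # Anything else falls through to the standard TypeError.
        return super().default(obj)


def dump_metrics(metrics: dict, path: str) -> None:
    """Write a metrics dictionary to a JSON file (assumed helper)."""
    with open(path, "w") as f:
        json.dump(metrics, f, cls=ArrayJSONEncoder, indent=2)
```

Compared with pickled PyTorch objects, a JSON file written this way can be read without PyTorch installed and inspected in any text editor.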

walkernr added 5 commits January 17, 2022 00:51

corrected doping dataset

1fc2371

corrected doping dataset

9472336

readme update

6ccd879

readme update

b939802

add model weights

8a35533

walkernr wants to merge 35 commits into lbnlp:main from walkernr:main
