IsmaelMousa/modern-bert-ner

ModernBERT for Named Entity Recognition (NER)

A fine-tuned ModernBERT-base model for Named Entity Recognition, trained on the CoNLL-2003 dataset to identify persons, organizations, locations, and miscellaneous entities in English text.

Model Overview

On the CoNLL-2003 validation set, the model scores:

  • F1 Score: 84.55%
  • Precision: 83.49%
  • Recall: 85.63%
  • Accuracy: 97.52%

Quick Start

from transformers import pipeline

ner     = pipeline(task="token-classification", model="IsmaelMousa/modernbert-ner-conll2003", aggregation_strategy="max")

text    = "Hi, I'm Ismael Mousa from Palestine working for NVIDIA inc."

results = ner(text)

# Pad the entity text to 12 characters so the labels line up as in the output below.
for entity in results:
    print(f"{entity['word']:<12} => {entity['entity_group']}")

Output:

Ismael Mousa => PER
Palestine    => LOC
NVIDIA       => ORG
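
With an aggregation strategy set, each item the pipeline returns is a dictionary that also carries a confidence `score` and the `start`/`end` character offsets of the span. The following is a minimal sketch of post-processing that output: it groups entities by type and drops low-confidence ones. The helper `group_entities`, the `min_score` parameter, and the sample scores are illustrative assumptions and not part of this repository.

```python
from collections import defaultdict


def group_entities(results, min_score=0.0):
    """Group aggregated pipeline output by entity type.

    Keeps only spans whose confidence is at least `min_score`.
    """
    grouped = defaultdict(list)
    for entity in results:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


# Hand-written stand-in for `ner(text)`; the scores are made up for illustration.
sample_results = [
    {"entity_group": "PER", "score": 0.99, "word": "Ismael Mousa", "start": 8, "end": 20},
    {"entity_group": "LOC", "score": 0.98, "word": "Palestine", "start": 26, "end": 35},
    {"entity_group": "ORG", "score": 0.60, "word": "NVIDIA", "start": 48, "end": 54},
]

print(group_entities(sample_results, min_score=0.9))
```

In real use, pass the list returned by `ner(text)` in place of `sample_results`.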

Entity Types

The model recognizes four entity categories:

  • PER: Person names
  • ORG: Organizations
  • LOC: Locations
  • MISC: Miscellaneous entities
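
At the token level, CoNLL-2003 tags these four categories with a BIO scheme, which gives nine labels in total. The sketch below builds the label maps under the assumption that the model follows the usual CoNLL-2003 label order; the `id2label` mapping in the model's own configuration is the authoritative source.

```python
ENTITY_TYPES = ["PER", "ORG", "LOC", "MISC"]

# "O" marks tokens outside any entity; "B-" opens a span and "I-" continues it.
labels = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")]

# Assumed index order; check the model config before relying on specific ids.
id2label = dict(enumerate(labels))
label2id = {label: idx for idx, label in id2label.items()}

print(labels)
```

The aggregated pipeline output in Quick Start strips the `B-`/`I-` prefixes, which is why it reports `PER` rather than `B-PER`.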

Training Details

  • Base Model: ModernBERT-base (149.6M parameters)
  • Dataset: CoNLL-2003
  • Training Examples: 14,041
  • Epochs: 10
  • Learning Rate: 1e-6

Performance

Evaluated on the CoNLL-2003 validation set (3,250 examples):

| Metric    | Score  |
|-----------|--------|
| F1        | 0.8455 |
| Precision | 0.8349 |
| Recall    | 0.8563 |
| Accuracy  | 0.9752 |
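
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the other two figures. This short sketch uses only the numbers listed above.

```python
precision, recall = 0.8349, 0.8563

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"{f1:.4f}")  # 0.8455
```

seqeval computes precision, recall, and F1 over whole entity spans, whereas accuracy is computed per token, which helps explain why accuracy is so much higher than the other three scores.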

Implementation

The training process involves:

  1. Tokenizing text with word-level alignment
  2. Mapping labels to handle subword tokenization
  3. Training with cross-entropy loss on token classification
  4. Evaluating with seqeval metrics
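
Steps 1 and 2 can be sketched as follows. This is a common alignment recipe and not necessarily the one used in the training notebook: the first subword of each word keeps the word's label, while special tokens and continuation subwords receive `-100` so that the cross-entropy loss ignores them. The `word_ids` list has the shape a fast tokenizer's `word_ids()` method returns; the helper name `align_labels` and the example tokenization are hypothetical.

```python
IGNORE_INDEX = -100  # PyTorch's CrossEntropyLoss ignores targets with this value by default


def align_labels(word_ids, word_labels):
    """Expand word-level label ids to token level.

    `word_ids` maps each token to its source word index, with None for
    special tokens. Only the first token of each word keeps the label.
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(IGNORE_INDEX)          # special token
        elif word_id != previous:
            aligned.append(word_labels[word_id])  # first subword of a word
        else:
            aligned.append(IGNORE_INDEX)          # continuation subword
        previous = word_id
    return aligned


# Hypothetical tokenization of "Ismael Mousa works" where "Ismael" splits into two subwords.
word_ids = [None, 0, 0, 1, 2, None]
word_labels = [1, 2, 0]  # B-PER, I-PER, O under the usual CoNLL-2003 id order

print(align_labels(word_ids, word_labels))  # [-100, 1, -100, 2, 0, -100]
```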

See the training notebook for complete implementation details.

License

Apache 2.0
