Rumantsch Language Tools

A collection of tools for working with the Rumantsch (Romansh) language, published as an npm package.

Installation

pnpm add @farscrl/rumantsch-language-tools

Features

  • Tokenizer — splits text into individual tokens, handling Romansh-specific abbreviations and punctuation (based on stdlib-js/nlp-tokenize)
  • Proofreader — hunspell-based spellchecker supporting all Romansh idioms

Supported idioms

Code          Idiom
rm-puter      Puter
rm-rumgr      Rumantsch Grischun
rm-surmiran   Surmiran
rm-sursilv    Sursilvan
rm-sutsilv    Sutsilvan
rm-vallader   Vallader

Usage

Tokenizer

import { Tokenizer } from '@farscrl/rumantsch-language-tools';

Tokenizer.tokenize('In test dal tokenizer.');
// ['In', 'test', 'dal', 'tokenizer']

Proofreader

import { Proofreader } from '@farscrl/rumantsch-language-tools';

const proofreader = await Proofreader.CreateProofreader('rm-surmiran');

await proofreader.proofreadText('Text correct');
// [] — no errors

await proofreader.proofreadText('in');
// [{ word: 'in', offset: 0, length: 2 }]

await proofreader.getSuggestions('corect');
// ['correct', ...]
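
The two calls above can be combined so that every flagged word comes back with its suggestions. The following is a minimal sketch, not part of the package: the helpers annotateErrors and markErrors are hypothetical, and the ProofreaderLike and SpellingError interfaces are assumptions inferred from the example output shown above (word, offset, length). It also assumes that offset and length index into the original string.

```typescript
// Shapes inferred from the README's example output; treat them as assumptions.
interface SpellingError {
  word: string;
  offset: number;
  length: number;
}

interface ProofreaderLike {
  proofreadText(text: string): Promise<SpellingError[]>;
  getSuggestions(word: string): Promise<string[]>;
}

interface AnnotatedError extends SpellingError {
  suggestions: string[];
}

// Hypothetical helper: proofread a text and attach up to `max`
// suggestions to each flagged word.
async function annotateErrors(
  proofreader: ProofreaderLike,
  text: string,
  max = 3
): Promise<AnnotatedError[]> {
  const errors = await proofreader.proofreadText(text);
  const annotated: AnnotatedError[] = [];
  for (const error of errors) {
    const suggestions = await proofreader.getSuggestions(error.word);
    annotated.push({ ...error, suggestions: suggestions.slice(0, max) });
  }
  return annotated;
}

// Hypothetical helper: wrap each flagged span in square brackets.
// Spans are processed from the end of the text backwards so that
// earlier offsets remain valid while the string grows.
function markErrors(text: string, errors: SpellingError[]): string {
  const sorted = [...errors].sort((a, b) => b.offset - a.offset);
  let result = text;
  for (const { offset, length } of sorted) {
    result =
      result.slice(0, offset) +
      "[" + result.slice(offset, offset + length) + "]" +
      result.slice(offset + length);
  }
  return result;
}
```

With a real instance, this would be called as annotateErrors(proofreader, text), and the result can be passed to markErrors together with the same text.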

Dictionary files are fetched from https://www.spellchecker.pledarigrond.ch by default. To use a self-hosted mirror, pass a baseUrl option:

const proofreader = await Proofreader.CreateProofreader('rm-surmiran', {
  baseUrl: 'https://your-host.example.com'
});

Call unload() when you are done to free the dictionary from memory:

proofreader.unload();
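
To make sure unload() runs even when proofreading throws, the call can be placed in a finally block. The sketch below shows one way to do this. The withProofreader helper and the Unloadable interface are hypothetical and not part of the package; the only assumption about the proofreader is that it exposes unload() as shown above.

```typescript
// Assumption: the proofreader exposes unload(), as in the README example.
interface Unloadable {
  unload(): void;
}

// Hypothetical helper: create a proofreader, run `work` with it,
// and always unload it afterwards, whether `work` succeeds or throws.
async function withProofreader<P extends Unloadable, R>(
  create: () => Promise<P>,
  work: (proofreader: P) => Promise<R>
): Promise<R> {
  const proofreader = await create();
  try {
    return await work(proofreader);
  } finally {
    proofreader.unload();
  }
}

// Possible usage with the package (not executed here):
//   const errors = await withProofreader(
//     () => Proofreader.CreateProofreader('rm-surmiran'),
//     (p) => p.proofreadText('Text correct')
//   );
```

Passing a factory function instead of an existing instance keeps creation and cleanup in one place, so callers cannot forget the unload() call.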

Development

pnpm install      # install dependencies
pnpm test         # run tests
pnpm run build    # compile to lib/
pnpm run lint     # lint
