Modules & Commands

hfst

Morphological analysis with finite state transducers.

tokenize

Tokenize text using a PMHFST model.

let x = hfst.tokenize(entry, {
    model_path: "tokeniser.pmhfst"
});

Input: String | Output: String (CG3 format)

cg3

Constraint Grammar disambiguation and processing.

vislcg3

Apply CG3 rules from a compiled grammar.

let x = cg3.vislcg3(input, {
    model_path: "grammar.bin",
    config: { trace: false }
});

Input: String (CG3) | Output: String (CG3)

Tip

Enable tracing: -c 'vislcg3={"config":{"trace":true}}'
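
The override in the tip has the shape `<command-id>=<JSON value>`. As a rough sketch, the string could be assembled like this when building a command line from a script. `configOverride` and `shellQuote` are hypothetical helpers written for this illustration, not part of the runtime.

```typescript
// Hypothetical helper: formats one `-c` override as `<command-id>=<JSON>`,
// following the shape shown in the tip above.
function configOverride(commandId: string, value: unknown): string {
  return `${commandId}=${JSON.stringify(value)}`;
}

// Wraps an argument in single quotes for a POSIX shell, escaping any
// single quotes inside it.
function shellQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

const traceOverride = configOverride("vislcg3", { config: { trace: true } });
// traceOverride is: vislcg3={"config":{"trace":true}}

const cliArgs = ["-c", shellQuote(traceOverride)].join(" ");
// cliArgs is: -c 'vislcg3={"config":{"trace":true}}'
```

Serializing with `JSON.stringify` avoids hand-writing nested quotes, which is where these overrides usually go wrong.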

mwesplit

Split multi-word expressions.

let x = cg3.mwesplit(input);

Input: String (CG3) | Output: String (CG3)

streamcmd

Insert CG3 stream commands (SETVAR, REMVAR).

let x = cg3.streamcmd(input, { key: "SETVAR" });

Input: String (CG3) | Output: String (CG3)

Tip

Configure: -c 'cmd-id="variable=value"'

sentences

Extract sentences from a CG3 stream.

let sentences = cg3.sentences(input, {
    mode: "surface"  // or "phonological" for TTS
});

Input: String (CG3) | Output: ArrayString
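
To give a feel for what "surface" extraction involves, here is a minimal sketch that rebuilds plain text from the wordform lines of a CG3 stream. It is an illustration only: `surfaceSentences` is a hypothetical function, the splitting rule (break after `.`, `!` or `?`) is an assumption, and so is the treatment of lines starting with `:` as the text between tokens. The runtime's own logic may differ.

```typescript
// Illustrative only: rebuilds surface text from a CG3 stream and splits it
// after sentence-final punctuation. Not the runtime's actual algorithm.
function surfaceSentences(cg3: string): string[] {
  const sentences: string[] = [];
  let current = "";
  for (const line of cg3.split("\n")) {
    // Cohort lines hold the wordform as "<form>".
    const cohort = /^"<(.*)>"\s*$/.exec(line);
    if (cohort) {
      current += cohort[1];
      if (/^[.!?]$/.test(cohort[1])) {
        sentences.push(current.trim());
        current = "";
      }
    } else if (line.charAt(0) === ":") {
      // Assumption: a line starting with ":" carries the text between tokens.
      current += line.slice(1);
    }
    // Tab-indented reading lines are ignored for surface extraction.
  }
  if (current.trim() !== "") {
    sentences.push(current.trim());
  }
  return sentences;
}

const sampleStream = [
  '"<Hello>"', '\t"hello" Interj', ": ",
  '"<world>"', '\t"world" N Sg',
  '"<.>"', '\t"." CLB', ": ",
  '"<Bye>"', '\t"bye" Interj',
].join("\n");

const extracted = surfaceSentences(sampleStream);
// extracted is ["Hello world.", "Bye"]
```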

to_json

Convert CG3 output to JSON.

let json = cg3.to_json(input);

Input: String (CG3) | Output: Json
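
The JSON schema produced by `to_json` is not described here. As a sketch of the underlying idea, the following parses the two basic CG3 line types (a `"<wordform>"` cohort line followed by tab-indented `"lemma" TAG ...` readings) into plain objects. The `Cohort` and `Reading` shapes and the `parseCg3` function are assumptions made for this example and may not match the actual output.

```typescript
// Hypothetical output shapes; the real to_json schema may differ.
interface Reading {
  lemma: string;
  tags: string[];
}

interface Cohort {
  wordform: string;
  readings: Reading[];
}

// Parses cohort and reading lines; every other line is skipped.
function parseCg3(cg3: string): Cohort[] {
  const cohorts: Cohort[] = [];
  for (const line of cg3.split("\n")) {
    const cohort = /^"<(.*)>"\s*$/.exec(line);
    if (cohort) {
      cohorts.push({ wordform: cohort[1], readings: [] });
      continue;
    }
    // A reading is indented with tabs: "lemma" followed by its tags.
    const reading = /^\t+"([^"]*)"\s*(.*)$/.exec(line);
    if (reading && cohorts.length > 0) {
      const tags = reading[2].split(/\s+/).filter((t) => t !== "");
      cohorts[cohorts.length - 1].readings.push({ lemma: reading[1], tags });
    }
  }
  return cohorts;
}

const parsed = parseCg3('"<dogs>"\n\t"dog" N Pl\n\t"dog" V Prs Sg3\n');
const asJson = JSON.stringify(parsed);
```

Lemmas that themselves contain a double quote are not handled; a fuller parser would need to cover that case along with subreadings and stream commands.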

divvun

Spell/grammar checking and suggestions.

blanktag

Analyze whitespace using an HFST model.

let x = divvun.blanktag(input, {
    model_path: "analyser-gt-whitespace.hfst"
});

Input: String (CG3) | Output: String (CG3)

cgspell

Spell check with error models.

let x = divvun.cgspell(input, {
    err_model_path: "errmodel.hfst",
    acc_model_path: "acceptor.hfst",
    config: {
        n_best: 10,
        max_weight: 5000.0,
        beam: 15.0,
        recase: true
    }
});

Input: String (CG3) | Output: String (CG3 with suggestions)
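
The meaning of `n_best`, `max_weight` and `beam` is not spelled out above. The sketch below shows one common reading of such options, assumed here rather than confirmed for `cgspell`: `max_weight` drops candidates above an absolute weight, `beam` drops candidates too far above the best weight, and `n_best` caps how many remain. `pruneSuggestions` and its types are hypothetical.

```typescript
interface Suggestion {
  form: string;
  weight: number; // lower is better
}

interface SpellConfig {
  n_best: number;
  max_weight: number;
  beam: number;
}

// Assumed semantics, for illustration only: sort by weight, apply the
// absolute cutoff and the beam relative to the best candidate, then
// keep at most n_best results.
function pruneSuggestions(cands: Suggestion[], cfg: SpellConfig): Suggestion[] {
  if (cands.length === 0) {
    return [];
  }
  const sorted = cands.slice().sort((a, b) => a.weight - b.weight);
  const best = sorted[0].weight;
  return sorted
    .filter((s) => s.weight <= cfg.max_weight && s.weight - best <= cfg.beam)
    .slice(0, cfg.n_best);
}

const kept = pruneSuggestions(
  [
    { form: "a", weight: 12 },
    { form: "b", weight: 3 },
    { form: "c", weight: 20 },
    { form: "d", weight: 6000 },
  ],
  { n_best: 10, max_weight: 5000.0, beam: 15.0 }
);
// kept holds "b" (3) and "a" (12): "c" is outside the beam (20 - 3 > 15)
// and "d" exceeds max_weight.
```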

suggest

Generate an error report with suggestions.

let errors = divvun.suggest(input, {
    model_path: "generator.hfstol"
});

Input: String (CG3 with error tags) | Output: Json (error array)

Tip

Configure locales and filters: -c 'suggest={"locales":["fo","en"],"ignore":["typo"]}'

speech

Text-to-speech synthesis.

Note

Speech features must be enabled at build time.

normalize

Normalize text for TTS.

let x = speech.normalize(input, {
    normalizers: { "Sem/Plc": "place-norm.hfst" },
    generator: "generator.hfst",
    analyzer: "analyzer.hfst"
});

Input: String (CG3) | Output: String (CG3 with phonological forms)

phon

Add phonological forms.

let x = speech.phon(input, {
    model: "phon.hfst",
    tag_models: { "Prop": "phon-prop.hfst" }
});

Input: String (CG3) | Output: String (CG3 with phon tags)

tts

Synthesize speech.

let audio = speech.tts(sentences, {
    voice_model: "voice.onnx",
    univnet_model: "vocoder.onnx",
    speaker: 0,
    language: 0,
    alphabet: "sme"  // "sme", "smj", "sma", "smi"
});

Input: String or ArrayString | Output: Bytes (WAV audio)
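
Since the output is raw WAV bytes, a quick sanity check before writing them to disk can catch an empty or malformed result. The sketch below inspects only the standard RIFF/WAVE header fields. `looksLikeWav` is a hypothetical helper, and it assumes the bytes are available as a `Uint8Array`.

```typescript
// Checks the 12-byte RIFF header: "RIFF" at offset 0 and "WAVE" at offset 8.
// This confirms the container type only, not that the audio is playable.
function looksLikeWav(bytes: Uint8Array): boolean {
  if (bytes.length < 12) {
    return false;
  }
  const tag = (offset: number): string =>
    String.fromCharCode(
      bytes[offset],
      bytes[offset + 1],
      bytes[offset + 2],
      bytes[offset + 3]
    );
  return tag(0) === "RIFF" && tag(8) === "WAVE";
}

// A bare header: "RIFF", a 4-byte size field, then "WAVE".
const headerOnly = new Uint8Array([
  0x52, 0x49, 0x46, 0x46, 0, 0, 0, 0, 0x57, 0x41, 0x56, 0x45,
]);
const headerOk = looksLikeWav(headerOnly);
```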

Tip

Override speaker: -c 'tts-cmd={"speaker":1}'

example

Learning and demo functions.

reverse

Reverse a string.

let x = example.reverse(entry);

upper

Convert to uppercase.

let x = example.upper(entry);