generate_anchor_list
Generate an anchor file needed by the Java aligner.
GenerateAnchorList
Generate anchor list used by tca2.
Source code in /home/anders/projects/CorpusTools/corpustools/generate_anchor_list.py
__init__(lang1, lang2, columns, input_file)
Initialise the GenerateAnchorList class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lang1 | str | the main lang | required |
| lang2 | str | the translated lang | required |
| columns | list of str | contains all the possible langs found in the main anchor file. | required |
| input_file | str | path of the existing anchor file. | required |
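As a rough illustration of how these parameters relate, the sketch below assumes that `columns` lists the languages in the order they appear in the main anchor file, so that `lang1` and `lang2` can each be mapped to a column position. The language codes are made-up sample values, and the index lookup is an assumption about how the parameters fit together, not a description of the class's internals.

```python
# Hypothetical sample values; real language codes depend on your anchor file.
columns = ["eng", "nob", "sme", "fin"]
lang1, lang2 = "nob", "sme"

# Assumption: each language in `columns` corresponds to one column of the
# anchor file, so its position in the list identifies that column.
lang1_index = columns.index(lang1)
lang2_index = columns.index(lang2)

print(lang1_index, lang2_index)
```

Note that `list.index` raises `ValueError` when the language is not in `columns`, so under this assumption both `lang1` and `lang2` must appear in the list.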
generate_file(outpath, quiet=False)
Generate the anchor list and write it to outpath.
read_anchors(quiet=False)
Return the list of word pairs found in the input file; empty and malformed lines are skipped.
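The skipping behaviour described above can be sketched as a filter over parsed lines. This is a minimal standalone illustration, not the library's code: `parse` and `collect_pairs` are hypothetical names, and the two-column `word1 / word2` line format is an assumption chosen to keep the example short.

```python
from typing import Iterable, List, Optional, Tuple

Pair = Tuple[str, str]


def parse(line: str) -> Optional[Pair]:
    """Hypothetical stand-in parser that expects 'word1 / word2'."""
    fields = [field.strip() for field in line.split("/")]
    if len(fields) == 2 and all(fields):
        return fields[0], fields[1]
    return None


def collect_pairs(lines: Iterable[str]) -> List[Pair]:
    """Keep only the lines that yield a word pair."""
    parsed = (parse(line) for line in lines)
    return [pair for pair in parsed if pair is not None]


# Made-up sample input: two usable lines, one empty line, one malformed line.
sample = ["hund / beana", "", "broken line", "hus / viessu"]
pairs = collect_pairs(sample)
print(pairs)
```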
words_of_line(lineno, line)
Return a word pair, or None if the line contains no word pair.
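A pair-or-None contract of this kind might look like the sketch below. Everything in it is an assumption made for illustration: the helper name `pair_from_line`, the `/` column separator, the treatment of `#` lines as comments, and the rule that a line with the wrong number of columns is malformed. The real method takes `lineno` and `line` and may behave differently.

```python
from typing import Optional, Tuple


def pair_from_line(
    line: str, index1: int, index2: int, n_columns: int, sep: str = "/"
) -> Optional[Tuple[str, str]]:
    """Return (word1, word2) or None. Hypothetical helper, not the library's code."""
    line = line.strip()
    # Assumption: blank lines and '#' comment lines carry no word pair.
    if not line or line.startswith("#"):
        return None
    fields = [field.strip() for field in line.split(sep)]
    # Assumption: a line with the wrong number of columns is malformed.
    if len(fields) != n_columns:
        return None
    word1, word2 = fields[index1], fields[index2]
    # Both languages need an entry for the line to yield a pair.
    if word1 and word2:
        return word1, word2
    return None


print(pair_from_line("dog / hund / beana", 1, 2, 3))
print(pair_from_line("dog / / beana", 1, 2, 3))
```

Returning None instead of raising lets a caller drop unusable lines with a simple filter.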