one-off-functions
One-off functions to set metadata.
Might be useful in other contexts.
find_endings(directories, suffix)
Find all files with suffix within directories.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
directories | list of str | list of directories to walk | required |
suffix | str | file suffix to search for | required |
Yields:
Type | Description |
---|---|
str | path to file with suffix |
Source code in /home/anders/projects/CorpusTools/corpustools/one-off-functions.py
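A minimal sketch of a generator matching the description above, assuming it walks each directory recursively with `os.walk` and matches on the end of the file name. This is an illustration and not necessarily how CorpusTools implements `find_endings`; the directory name in the usage lines is hypothetical.

```python
import os
from typing import Iterable, Iterator


def find_endings(directories: Iterable[str], suffix: str) -> Iterator[str]:
    """Yield the path of every file under directories whose name ends with suffix."""
    for directory in directories:
        # os.walk visits the directory itself and all of its subdirectories.
        for root, _, files in os.walk(directory):
            for name in files:
                if name.endswith(suffix):
                    yield os.path.join(root, name)


if __name__ == "__main__":
    # "corpus-sme-orig" is a hypothetical directory name used for illustration.
    for path in find_endings(["corpus-sme-orig"], ".xsl"):
        print(path)
```

Because the function is a generator, paths are produced lazily; wrap the call in `list(...)` if all matches are needed at once.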
regjeringen_no(directories)
Set metadata for regjeringen.no html files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
directories | list of str | list of directories to walk | required |
Source code in /home/anders/projects/CorpusTools/corpustools/one-off-functions.py
skuvla_historja(directories)
Find skuvlahistorja directories in paths, set year.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
directories | list of str | list of directories to walk | required |
Source code in /home/anders/projects/CorpusTools/corpustools/one-off-functions.py
to_free(path)
Set the license type.
Source code in /home/anders/projects/CorpusTools/corpustools/one-off-functions.py
translated_from(url_part, mainlang, directories)
Set all docs from url_part to be translated from mainlang.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
url_part | str | the defining part of the url | required |
mainlang | str | three-character language code | required |
directories | list of str | list of directories to walk | required |
Source code in /home/anders/projects/CorpusTools/corpustools/one-off-functions.py
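The selection rule behind `translated_from` can be illustrated on in-memory records. In this sketch the helper name `mark_translated` and the record keys `url` and `translated_from` are hypothetical and not part of CorpusTools; the documented function instead takes `directories` and works on the files found there. The example URLs and language codes are invented for illustration.

```python
from typing import Dict, Iterable


def mark_translated(
    records: Iterable[Dict[str, str]], url_part: str, mainlang: str
) -> int:
    """Set translated_from on every record whose URL contains url_part.

    Hypothetical helper: the keys "url" and "translated_from" are
    assumptions made for this sketch, not CorpusTools metadata names.
    Returns the number of records that were updated.
    """
    changed = 0
    for record in records:
        # A record without a URL never matches.
        if url_part in record.get("url", ""):
            record["translated_from"] = mainlang
            changed += 1
    return changed


# Invented sample data for illustration only.
docs = [
    {"url": "https://example.org/nyheter/a.html"},
    {"url": "https://other.example/b.html"},
]
updated = mark_translated(docs, "example.org", "nob")
```

Only the first record contains `example.org` in its URL, so only that record receives the `translated_from` value.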