finder
Manage corpus files in various ways.
move_twenty_percent_to_goldcorpus()
Move twenty percent of the files to the goldcorpus.
Source code in /home/anders/projects/CorpusTools/corpustools/finder.py
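The source listing for this function is not reproduced here, so the following is only a minimal sketch of the general idea, not the CorpusTools implementation. The function name `move_fraction`, its parameters, and the random selection strategy are all assumptions made for illustration.

```python
import random
import shutil
from pathlib import Path


def move_fraction(source_dir, target_dir, fraction=0.2, seed=None):
    """Move a random fraction of the files under source_dir to target_dir.

    Hypothetical helper: the name, signature and selection strategy are
    assumptions, not taken from corpustools/finder.py.
    """
    source = Path(source_dir)
    target = Path(target_dir)

    # Collect the files first so that moving them does not disturb the walk.
    files = sorted(p for p in source.rglob("*") if p.is_file())

    # Pick the subset to move; a seed makes the selection reproducible.
    rng = random.Random(seed)
    chosen = rng.sample(files, int(len(files) * fraction))

    moved = []
    for path in chosen:
        # Keep the relative directory layout below the target directory.
        destination = target / path.relative_to(source)
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(destination))
        moved.append(destination)
    return moved
```

The sketch assumes the target directory lies outside the source directory; otherwise files that were already moved would be picked up again on a later run.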
remove_files_with_duplicate_content()
To replace: 123, , 339, 340
Source code in /home/anders/projects/CorpusTools/corpustools/finder.py
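Going only by the function name, one common way to remove files with duplicate content is to hash each file and delete any file whose hash has already been seen. The sketch below shows that technique under stated assumptions; `remove_duplicate_content` is a hypothetical helper, and the actual CorpusTools function may select and compare files differently.

```python
import hashlib
from pathlib import Path


def remove_duplicate_content(directory):
    """Delete files whose content duplicates an earlier file.

    Hypothetical helper: name, signature and hashing strategy are
    assumptions, not taken from corpustools/finder.py.
    """
    seen = {}      # content digest -> first path found with that content
    removed = []   # paths that were deleted as duplicates

    # sorted() builds the full list up front, so deleting files during
    # the loop is safe, and the first path in sort order is the one kept.
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            path.unlink()
            removed.append(path)
        else:
            seen[digest] = path
    return removed
```

Reading each file fully into memory keeps the sketch short; for large corpus files, hashing in chunks would be the safer choice.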