html_cleaner
Script that writes a nicely indented HTML document.
Mainly used to debug the input to the converter.HTMLContentConverter.
main()
Convert an HTML file and print the result to outfile.
Source code in /home/anders/projects/CorpusTools/corpustools/html_cleaner.py
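To illustrate what "nicely indented" output can look like, here is a minimal sketch built only on the standard library's `html.parser`. It is not the module's actual implementation, which is not shown on this page; the names `IndentingPrinter` and `indent_html` are hypothetical, and the one-element-per-line layout is an assumption.

```python
import html
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class IndentingPrinter(HTMLParser):
    """Hypothetical helper: collect one indented line per tag or text node."""

    def __init__(self, indent="  "):
        super().__init__(convert_charrefs=True)
        self.indent = indent
        self.depth = 0
        self.lines = []

    def _emit(self, text):
        self.lines.append(self.indent * self.depth + text)

    def handle_starttag(self, tag, attrs):
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{html.escape(value)}"'
            for name, value in attrs
        )
        self._emit(f"<{tag}{rendered}>")
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        self.depth = max(self.depth - 1, 0)
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        text = data.strip()
        if text:
            # Character references were decoded by the parser; escape them again.
            self._emit(html.escape(text, quote=False))


def indent_html(source):
    """Return `source` with every tag and text node on its own indented line."""
    printer = IndentingPrinter()
    printer.feed(source)
    printer.close()
    return "\n".join(printer.lines)
```

Calling `indent_html("<p>Hi</p>")` yields three lines: the opening tag, the indented text, and the closing tag. Note that moving text onto its own line changes whitespace, which matters inside elements such as `pre`; a debugging aid can usually tolerate that, but it is a limitation of this sketch.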
parse_args()
Parse the command-line options.
Returns:
Type | Description |
---|---|
argparse.Namespace | the parsed commandline arguments |
Source code in /home/anders/projects/CorpusTools/corpustools/html_cleaner.py
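A function with this contract might look like the sketch below. The page documents only the `argparse.Namespace` return type, so the argument names are assumptions: `outfile` is borrowed from the description of `main()`, `infile` is hypothetical, and the optional `argv` parameter is added here only to make the sketch easy to try out.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line options (illustrative sketch only).

    The positional names are assumptions, not the script's documented interface.
    """
    parser = argparse.ArgumentParser(
        description="Write a nicely indented HTML document."
    )
    parser.add_argument("infile", help="path to the HTML file to read (hypothetical)")
    parser.add_argument("outfile", help="path to write the indented HTML to")
    # With argv=None, argparse falls back to sys.argv[1:].
    return parser.parse_args(argv)
```

Passing an explicit list, for example `parse_args(["in.html", "out.html"])`, returns a namespace whose `infile` and `outfile` attributes hold the two paths.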