Usage
senfd --help
usage: senfd [-h] [--output OUTPUT] [--dump-schema] [--version] [document ...]

Semantically organize and enrich figures

positional arguments:
  document         path to one or more document(s)

options:
  -h, --help       show this help message and exit
  --output OUTPUT  directory where the output will be saved
  --dump-schema    dump schema(s) and exit
  --version        print the version and exit
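When senfd is driven from a script rather than typed by hand, the synopsis above can be assembled programmatically. The following is a minimal sketch that uses only the options shown in the help text; the helper names build_senfd_command and run_senfd are hypothetical and not part of senfd.

```python
import subprocess


def build_senfd_command(documents, output=None):
    """Build an argument list matching the synopsis:
    senfd [--output OUTPUT] [document ...]
    (hypothetical helper, not part of senfd itself)
    """
    if not documents:
        raise ValueError("at least one document path is required")
    cmd = ["senfd"]
    if output is not None:
        cmd += ["--output", str(output)]
    cmd += [str(doc) for doc in documents]
    return cmd


def run_senfd(documents, output=None):
    """Run senfd and raise CalledProcessError on a non-zero exit status.
    Assumes the senfd executable is available on PATH."""
    return subprocess.run(build_senfd_command(documents, output), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_senfd(...) to actually execute it.
    print(" ".join(build_senfd_command(["example/example.docx"], output="/tmp/foo")))
```

Passing the arguments as a list avoids shell quoting problems when document paths contain spaces.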
Example
Place yourself in the root of the repository and run:
senfd example/example.docx --output /tmp/foo
This extracts table and figure information from the .docx file and
stores it as a FigureDocument with minimal semantic enrichment. The
FigureDocument is then processed to produce a CategorizedFigureDocument, in
which figures are categorized by the content they caption.
All output files are stored in the directory given by --output, in this
case /tmp/foo. Each input document gets its own folder holding the output
files related to it.
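To check what a run produced, the output directory can be walked folder by folder. This is a minimal sketch that assumes the output files carry a .json extension (the actual file names are not specified here); list_outputs is a hypothetical helper, not part of senfd.

```python
from pathlib import Path


def list_outputs(output_dir):
    """Map each per-document folder name to the sorted names of the
    JSON files directly inside it. Assumes a '.json' file extension."""
    root = Path(output_dir)
    result = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        result[folder.name] = sorted(f.name for f in folder.glob("*.json"))
    return result


if __name__ == "__main__":
    out = Path("/tmp/foo")  # the --output directory from the example
    if out.is_dir():
        for name, files in list_outputs(out).items():
            print(name, files)
```

Files placed directly in the output directory, outside any per-document folder, are ignored by this sketch.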
If you do not want to run it, you can inspect the output files in the
repository on GitHub
or locally in the folder example/output/document1.
For details on the structure of the JSON documents, see the schema section.
Auxiliary
The following tools are also convenient to have available when inspecting the JSON files:
jless - https://jless.io/
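If no viewer is installed, a few lines of Python can give a first overview of a JSON file. This is a sketch that assumes nothing about the senfd schema: it only reports the type of each top-level entry. top_level_summary is a hypothetical helper, and the path is passed on the command line.

```python
import json
import sys
from pathlib import Path


def top_level_summary(path):
    """Return {key: type name} for a JSON object, or a single
    '<root>' entry when the top-level value is not an object."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    return {"<root>": type(data).__name__}


if __name__ == "__main__":
    # Summarize every JSON file given on the command line.
    for arg in sys.argv[1:]:
        print(arg)
        for key, kind in top_level_summary(arg).items():
            print(f"  {key}: {kind}")
```

The type names are Python's (dict, list, str, int, float, bool, NoneType), which correspond to JSON objects, arrays, strings, numbers, booleans, and null.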