Usage

senfd --help
usage: senfd [-h] [--output OUTPUT] [--dump-schema] [--version] [document ...]

Semantically organize and enrich figures

positional arguments:
  document         path to one or more document(s)

options:
  -h, --help       show this help message and exit
  --output OUTPUT  directory where the output will be saved
  --dump-schema    dump schema(s) and exit
  --version        print the version and exit

Example

From the root of the repository, run:

senfd example/example.docx --output /tmp/foo

This extracts table and figure information from the .docx file and stores it as a FigureDocument with minimal semantic enrichment. The FigureDocument is then processed to produce a CategorizedFigureDocument, in which figures are categorized by the content they caption.

All output files are stored in the directory given by --output, in this case /tmp/foo. Each input document gets its own folder for the output files related to it.
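To get a quick overview of what a run produced, the per-document folder layout described above can be walked with a few lines of Python. This is only a sketch: the helper name list_output_files is hypothetical and not part of senfd, and it assumes the output files of interest carry a .json extension.

```python
from pathlib import Path


def list_output_files(output_dir):
    """Map each per-document folder name to the sorted JSON file names inside it.

    Hypothetical helper, not part of senfd.
    """
    root = Path(output_dir)
    listing = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        # Assumption: the output files of interest have a .json extension.
        listing[folder.name] = sorted(f.name for f in folder.rglob("*.json"))
    return listing
```

Calling list_output_files("/tmp/foo") after the example run should return one entry per input document. Files placed directly in the output directory, outside any per-document folder, are ignored by this sketch.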

If you do not want to run it, you can inspect the output files in the repository on GitHub or locally in the folder example/output/document1.

For details on the structure of the JSON documents, have a look at the schema section.
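Before reading the schema, it can help to peek at the top level of one of the JSON files. The following is a minimal sketch using only the Python standard library. The function name summarize_json is hypothetical, and nothing is assumed about the keys senfd actually emits; it simply reports whatever it finds.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Return a {key: type name} overview of the top level of a JSON file.

    Hypothetical helper, not part of senfd.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # Non-object documents (e.g. a top-level list) are summarized by type only.
    return {"<root>": type(data).__name__}
```

Pass it the path of any JSON file found in the output directory. The result is a rough orientation aid, not a substitute for the schema.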

Auxiliary

The following tools are also convenient to have available when inspecting the JSON files: