clams.mmif_utils package

Package providing utility functions for working with MMIF data.

rewind module

This module provides a CLI to rewind a MMIF file produced by a CLAMS pipeline, that is, to remove its most recently added views.

clams.mmif_utils.rewind.describe_argparser()

Returns two strings: a one-line description of the argparser, and additional material, which are shown in clams --help and clams <subcmd> --help, respectively.
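As a rough illustration of how such a pair of strings might feed a subcommand parser, here is a minimal sketch using only the standard argparse module. The describe_argparser below is a hypothetical stand-in with made-up strings, and the way the real clams CLI wires the two values into its parsers is an assumption.

```python
import argparse


def describe_argparser():
    # Hypothetical stand-in: the real strings live in clams.mmif_utils.rewind.
    oneliner = "rewind a MMIF file by removing trailing views"
    additional = "Removes the last N views, or all views from the last N apps."
    return oneliner, additional


def build_cli():
    parser = argparse.ArgumentParser(prog="clams")
    subparsers = parser.add_subparsers(dest="subcmd")
    oneliner, additional = describe_argparser()
    # `help` is listed in the top-level help output;
    # `description` is shown in the subcommand's own help output.
    subparsers.add_parser(
        "rewind",
        help=oneliner,
        description=f"{oneliner}\n\n{additional}",
    )
    return parser


parser = build_cli()
args = parser.parse_args(["rewind"])
print(args.subcmd)
```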

clams.mmif_utils.rewind.prompt_user(mmif_obj: Mmif) → int

Asks the user to choose the rewind range.

clams.mmif_utils.rewind.rewind_mmif(mmif_obj: Mmif, choice: int, choice_is_viewnum: bool = True) → Mmif

Rewinds a MMIF by deleting its last N views. N can be counted either in “views” or in “producer apps”; by default, it is interpreted as a number of views. Note that when the same app is run repeatedly in a CLAMS pipeline and produces multiple views in a row, rewinding in “app” mode removes all of those views at once.

Parameters:
  • mmif_obj – mmif object

  • choice – number of views (or of producer apps, when choice_is_viewnum is False) to rewind

  • choice_is_viewnum – if True, choice is the number of views to rewind. If False, choice is the number of producer apps to rewind.

Returns:

rewound mmif object
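The difference between the two modes can be sketched on plain data. The following is a minimal illustration, not the library's implementation: views are modeled as dicts with a hypothetical "app" key, and rewind_views is a made-up helper name.

```python
from typing import Dict, List


def rewind_views(views: List[Dict], choice: int,
                 choice_is_viewnum: bool = True) -> List[Dict]:
    """Return a new list with the trailing views removed."""
    if choice <= 0:
        return list(views)
    if choice_is_viewnum:
        # View mode: drop exactly the last `choice` views.
        return views[:max(len(views) - choice, 0)]
    # App mode: each step removes a whole run of consecutive
    # trailing views produced by the same app.
    kept = list(views)
    apps_removed = 0
    while kept and apps_removed < choice:
        app = kept[-1]["app"]
        while kept and kept[-1]["app"] == app:
            kept.pop()
        apps_removed += 1
    return kept


views = [
    {"id": "v_0", "app": "app-a"},
    {"id": "v_1", "app": "app-b"},
    {"id": "v_2", "app": "app-b"},
]

by_view = rewind_views(views, 1)                          # keeps v_0 and v_1
by_app = rewind_views(views, 1, choice_is_viewnum=False)  # keeps only v_0
```

In this toy data, app-b produced two views in a row, so rewinding one app removes both of them, while rewinding one view removes only the last.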

source module

This module provides a class for creating a “source” MMIF JSON object.

class clams.mmif_utils.source.WorkflowSource(common_documents_json: List[str | dict] | None = None, common_metadata_json: str | dict | None = None)

A WorkflowSource object is used at the beginning of a CLAMS workflow to populate a new MMIF file with media.

The same WorkflowSource object can be used repeatedly to generate multiple MMIF objects.

Parameters:
  • common_documents_json – a list of JSON documents (strings or dicts) that should be common to all MMIF objects produced by this workflow.

  • common_metadata_json – JSON metadata (a string or dict) that should be common to all MMIF objects produced by this workflow.

add_document(document: str | dict | Document) → None

Adds a document to the working source MMIF.

When you’re done, fetch the source MMIF with produce().

Parameters:

document – the document to add, as a JSON dict or string, or as a MMIF Document object

change_metadata(key: str, value)

Adds or changes a metadata entry in the working source MMIF.

Parameters:
  • key – the desired key of the metadata property

  • value – the desired value of the metadata property

from_data(doc_lists: Iterable[List[str | dict | Document]], metadata_objs: Iterable[str | dict | MmifMetadata | None] | None = None) → Generator[Mmif, None, None]

Provided with an iterable of document lists and an optional iterable of metadata objects, generates MMIF objects produced from that data.

doc_lists and metadata_objs should be matched pairwise, so that if they are zipped together, each pair defines a single MMIF object from this workflow source.

Parameters:
  • doc_lists – an iterable of document lists to generate MMIF from

  • metadata_objs – an iterable of metadata objects paired with the document lists

Returns:

a generator of MMIF objects produced from the data
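The pairwise matching described above can be sketched with plain dicts and zip. This is an illustration only: pair_sources is a hypothetical helper, the output dicts stand in for MMIF objects, and treating a missing metadata_objs as "no metadata for any item" is an assumption about the behavior.

```python
from itertools import repeat
from typing import Dict, Generator, Iterable, List, Optional


def pair_sources(
    doc_lists: Iterable[List[str]],
    metadata_objs: Optional[Iterable[Optional[Dict]]] = None,
) -> Generator[Dict, None, None]:
    """Yield one source-like dict per (document list, metadata) pair."""
    if metadata_objs is None:
        # Assumption: with no metadata given, every item gets empty metadata.
        metadata_objs = repeat(None)
    for docs, metadata in zip(doc_lists, metadata_objs):
        yield {"metadata": dict(metadata or {}), "documents": list(docs)}


doc_lists = [["a.mp4"], ["b.mp4", "b.txt"]]
metadata_objs = [{"batch": 1}, None]

results = list(pair_sources(doc_lists, metadata_objs))
```

Each element of results corresponds to one pair: the first combines ["a.mp4"] with {"batch": 1}, and the second combines the two "b" documents with empty metadata.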

prime() → None

Primes the WorkflowSource with a fresh MMIF object.

Call this method if you want to reset the WorkflowSource without producing a MMIF object with produce().

produce() → Mmif

Returns the source MMIF and resets the WorkflowSource.

Call this method once you have added all the documents for your Workflow.

Returns:

the current MMIF object that has been prepared
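The add, produce, and reset cycle described for WorkflowSource can be pictured with a small stand-in class. ToySource below is hypothetical and works on plain dicts; it only mirrors the documented behavior (common data is reapplied on every reset, and produce() resets after returning) and is not how the clams class is implemented.

```python
import copy


class ToySource:
    """Hypothetical stand-in for the add/produce/prime cycle."""

    def __init__(self, common_documents=None, common_metadata=None):
        self._common_documents = list(common_documents or [])
        self._common_metadata = dict(common_metadata or {})
        self.prime()

    def prime(self):
        # Start over from the common documents and metadata.
        self._working = {
            "metadata": copy.deepcopy(self._common_metadata),
            "documents": copy.deepcopy(self._common_documents),
        }

    def add_document(self, document):
        self._working["documents"].append(document)

    def change_metadata(self, key, value):
        self._working["metadata"][key] = value

    def produce(self):
        # Hand back the prepared object, then reset for the next one.
        produced = self._working
        self.prime()
        return produced


source = ToySource(common_metadata={"project": "example"})
source.add_document({"id": "d1", "location": "file:///tmp/a.mp4"})
first = source.produce()   # contains d1
second = source.produce()  # fresh object: common metadata, no documents
```

Because produce() resets the working object, the same source can be reused for the next set of documents without carrying over anything except the common data.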