The CLAMS project has many moving parts that make various computational analysis tools talk to each other to create customized workflow pipelines. The most important part of the project, however, is the set of apps published for the CLAMS platform. The CLAMS Python SDK helps app developers handle the MMIF data format with high-level classes and methods in Python, and publish their code as a CLAMS app that can be easily deployed to a site via CLAMS workflow engines, such as the CLAMS appliance.
A CLAMS app can be any software that performs automated content analysis on text, audio, and/or image/video data streams. An app must use MMIF as its I/O format. When deployed into a CLAMS appliance, an app needs to run as a webapp wrapped in a Docker container. In this documentation, we explain which Python APIs and HTTP APIs an app must implement.
The clams-python distribution package is available on PyPI. You can use pip to install the latest version.
pip install clams-python
Note that installing clams-python will also install the mmif-python PyPI package, which is a companion Python library for the MMIF data format.
A CLAMS app must be able to take MMIF JSON as input and return MMIF JSON as output. MMIF is a JSON(-LD)-based open source data format. For more details and discussions, please visit the MMIF website and the issue tracker.
The mmif-python PyPI package is installed together with clams-python, and it gives you access to the mmif Python package.
import mmif
from mmif.serialize import Mmif

new_mmif = Mmif()  # this will fail because an empty MMIF does not validate against the MMIF JSON schema
Because the APIs of the mmif package are well documented on the mmif-python website, we won’t go into more detail here. Please refer to the website.
Note on versions
Both clams-python and mmif-python, a separate PyPI distribution package that provides Python classes and methods for handling MMIF JSON strings, are under active development. Because of these rapid release cycles, a MMIF file of a specific version may not work with a CLAMS app that is based on an incompatible version of mmif-python. Every MMIF file must have its MMIF version encoded at the top of the file. Please keep in mind the versions of the Python libraries you are using and the specification version they target. To see the MMIF specification version targeted by the installed mmif-python package, look at the mmif.__specver__ variable (installing clams-python will also install mmif-python).
import mmif
mmif.__specver__
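To check the version encoded in a MMIF file itself, rather than the version targeted by the installed library, something like the following may help. It is a minimal sketch that uses only the standard library; it assumes the version is the last path segment of a URL stored under the metadata key mmif, and the helper name mmif_spec_version and the sample string are made up for illustration.

```python
import json


def mmif_spec_version(mmif_str: str) -> str:
    """Return the specification version encoded in a MMIF JSON string.

    Assumption: the version is the last path segment of the URL stored
    under metadata -> mmif, e.g. "http://mmif.clams.ai/0.4.0".
    """
    metadata = json.loads(mmif_str)["metadata"]
    return metadata["mmif"].rstrip("/").rsplit("/", 1)[-1]


# Hypothetical, stripped-down MMIF used only to exercise the helper.
sample = '{"metadata": {"mmif": "http://mmif.clams.ai/0.4.0"}, "documents": [], "views": []}'
print(mmif_spec_version(sample))
```

Comparing this value with mmif.__specver__ before processing a file is one way to catch a version mismatch early.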
A CLAMS app must report which MMIF specification version it targets in its metadata (see CLAMS App Metadata). When an app targets a specific version, it means that:
the app can only process input MMIF files that are compatible with the target version.
the app will output MMIF files exactly versioned as the target version.
Finally, for two specification versions to be compatible with each other, their minor version numbers must be the same. For example, 0.4.2 is compatible with 0.4.0, but is not compatible with a version whose minor number differs.
For more information on the relation between mmif-python versions and MMIF specification versions, or on MMIF version compatibility, please take the time to read our decision on the subject here. You can also find a table of all public clams-python packages and their target MMIF versions in Target MMIF Versions. This issue may seem complicated, but getting it right is crucial to building CLAMS as a platform.
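The compatibility rule described in this section can be sketched as a small comparison function. This is an illustration only, not code from mmif-python: the function names are hypothetical, version strings are assumed to be plain dot-separated integers, and comparing the major number in addition to the minor number is an assumption that goes beyond what the text above states.

```python
def parse_version(version: str) -> tuple:
    """Split a version string such as "0.4.2" into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(version_a: str, version_b: str) -> bool:
    """Treat two specification versions as compatible when their minor
    numbers match. The major number is compared as well (an assumption)."""
    return parse_version(version_a)[:2] == parse_version(version_b)[:2]


print(is_compatible("0.4.2", "0.4.0"))
# "0.5.0" is an illustrative value chosen here, not one taken from the text.
print(is_compatible("0.4.2", "0.5.0"))
```

The patch number is deliberately ignored, so 0.4.2 and 0.4.0 compare as compatible.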
CLAMS App API
A CLAMS Python app is a Python class that implements and exposes two core methods:
appmetadata(): Returns a JSON-formatted str that contains metadata about the app.
annotate(): Takes a MMIF as its only input, processes it, and returns the serialized MMIF.
A good way to start writing a CLAMS app is to inherit from clams.app.ClamsApp. If you do so, you will want to implement two private methods, _appmetadata() and _annotate(), instead of the two public methods above, because the public methods in the superclass internally call these private methods, respectively.
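The relationship between the public and private methods can be illustrated with a stand-in base class. The sketch below does not use clams-python at all: SketchApp and NoOpApp are hypothetical names, and the real ClamsApp may parse, validate, and serialize differently. It only shows the general pattern of public methods delegating to private hooks that a subclass fills in.

```python
import json
from abc import ABC, abstractmethod


class SketchApp(ABC):
    """Hypothetical stand-in for a base class such as clams.app.ClamsApp."""

    def appmetadata(self) -> str:
        # Public method: delegate to the private hook, then serialize.
        return json.dumps(self._appmetadata())

    def annotate(self, mmif_str: str) -> str:
        # Public method: parse the input, delegate, then serialize the result.
        return json.dumps(self._annotate(json.loads(mmif_str)))

    @abstractmethod
    def _appmetadata(self) -> dict:
        ...

    @abstractmethod
    def _annotate(self, mmif: dict) -> dict:
        ...


class NoOpApp(SketchApp):
    """Hypothetical app that returns its input unchanged."""

    def _appmetadata(self) -> dict:
        return {"name": "No-op app (hypothetical)"}

    def _annotate(self, mmif: dict) -> dict:
        # A real app would add a new view with annotations here.
        return mmif


app = NoOpApp()
print(app.appmetadata())
print(app.annotate('{"views": []}'))
```

With this layout a subclass only writes the analysis logic, while input parsing and output serialization stay in one place in the base class.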
We provide a tutorial with a real-world example at <Tutorial: writing a CLAMS app>. We highly recommend going through it.
Note on App metadata
App metadata is a map where important information about the app itself is stored as key-value pairs. See <CLAMS App Metadata> for the specification. In the future, the app metadata will be used to automatically generate the CLAMS app index in the CLAMS App Directory, and to automatically integrate apps into Galaxy in the appliance deployment.
To be integrated into the CLAMS appliance, a CLAMS app needs to serve as a webapp. Once your application class is ready, you can use clams.restify.Restifier to wrap your app as a Flask-based web application.
from clams.app import ClamsApp
from clams.restify import Restifier


class AnApp(ClamsApp):
    # Implements an app that does this and that.
    # Must implement `_appmetadata`, `_annotate` methods
    ...


if __name__ == "__main__":
    app = AnApp()
    webapp = Restifier(app)
    webapp.run()
When you run the above code, Python will start a web server and host your CLAMS app. By default the server will listen on 0.0.0.0:5000, but you can adjust the hostname and port number. In this webapp, appmetadata and annotate are mapped to GET and POST requests on the root route, respectively. Hence, for example, you can POST a MMIF file to the web app and get a response with the annotated MMIF string in the body.
In the above example, clams.restify.Restifier.run() starts the webapp in debug mode on a Werkzeug server, which is not always suitable for production. For a more robust server that can handle multiple requests asynchronously, you might want to use a production-ready HTTP server. In that case you can use serve_production(), which spins up a multi-worker Gunicorn server. If that does not suit you (for example, Gunicorn does not support Windows), you can write your own HTTP wrapper. At the end of the day, all you need is a webapp that maps appmetadata and annotate to GET and POST requests, respectively.
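For readers who want to write their own HTTP wrapper, a bare WSGI callable is one possible starting point. The sketch below uses only the standard library and is not how Restifier is implemented: EchoApp, make_wsgi_app, and call are hypothetical names, and the GET-to-appmetadata mapping, the status codes, and the content types are choices made for this illustration.

```python
import io
import json
from wsgiref.util import setup_testing_defaults


class EchoApp:
    """Hypothetical stand-in for a CLAMS app; not part of clams-python."""

    def appmetadata(self) -> str:
        return json.dumps({"name": "Echo app (hypothetical)"})

    def annotate(self, mmif_str: str) -> str:
        # A real app would add views and annotations; this returns the input.
        return mmif_str


def make_wsgi_app(clams_app):
    """Wrap an app object in a WSGI callable serving the root route:
    GET -> appmetadata(), POST -> annotate()."""

    def wsgi_app(environ, start_response):
        method = environ["REQUEST_METHOD"]
        content_type = "application/json"
        if environ.get("PATH_INFO", "/") != "/":
            status, body, content_type = "404 Not Found", "not found", "text/plain"
        elif method == "GET":
            status, body = "200 OK", clams_app.appmetadata()
        elif method == "POST":
            length = int(environ.get("CONTENT_LENGTH") or 0)
            payload = environ["wsgi.input"].read(length).decode("utf-8")
            status, body = "200 OK", clams_app.annotate(payload)
        else:
            status, body, content_type = "405 Method Not Allowed", "method not allowed", "text/plain"
        data = body.encode("utf-8")
        start_response(status, [("Content-Type", content_type),
                                ("Content-Length", str(len(data)))])
        return [data]

    return wsgi_app


def call(wsgi_app, method, body=b""):
    """Invoke the WSGI callable in-process, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)
    environ.update({"REQUEST_METHOD": method, "PATH_INFO": "/",
                    "CONTENT_LENGTH": str(len(body)),
                    "wsgi.input": io.BytesIO(body)})
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = wsgi_app(environ, start_response)
    return captured["status"], b"".join(chunks).decode("utf-8")


application = make_wsgi_app(EchoApp())
print(call(application, "GET"))
print(call(application, "POST", b'{"views": []}'))
```

Because the result is a plain WSGI callable, it could be served for local testing with wsgiref.simple_server.make_server("0.0.0.0", 5000, application).serve_forever(), or handed to any WSGI-compatible server.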
In addition to the HTTP service, a CLAMS app is expected to be containerized. Concretely, the appliance maker expects a CLAMS app to have a Dockerfile at the project root. Independently of compatibility with the CLAMS appliance, containerizing your app is recommended, especially when it processes video streams and depends on complicated system-level video processing libraries (e.g. OpenCV, FFmpeg).
Refer to the official documentation to learn how to write a Dockerfile. To integrate with the CLAMS appliance, a dockerized CLAMS app must automatically start itself as a webapp when instantiated as a container, and listen on port 5000 in the container.
We maintain a public Docker Hub account and publish Debian-based base images to help developers write a Dockerfile and save the build time needed to install common libraries. At the moment we have a basic image with Python 3.6 and clams-python installed. We will publish more images built with commonly used video and audio processing libraries.
CLAMS appliance integration
Finally, here are the requirements for an app to be appliance-compatible:
The app code is hosted on a public git repository.
The app is dockerized.
The app Docker image automatically starts the app as a webapp and listens on port 5000.
The Dockerfile for the dockerization is placed in the root of the git repository.
To learn how to deploy your app on an appliance instance, please refer to the appliance documentation.