Karma

Karma Tool for Provenance Collection and Storage

Overview

Provenance (also called lineage or trace) of digital scientific data is critical to broadening the sharing and reuse of that data. Provenance captures the information needed to attribute ownership and to determine, among other things, the quality of a particular data set. Provenance collection is often tightly coupled to a cyberinfrastructure system, but it is better served by a standalone tool. Karma is such a tool: it can be added to existing cyberinfrastructure to collect and represent provenance data. Its modular architecture supports multiple instrumentation plugins, which makes it usable in different architectural settings.

Karma originates in the Linked Environments for Atmospheric Discovery project and has since been generalized as a standalone tool. The tool is being applied to the Life Science Grid, an open source biochem discovery cyberinfrastructure promoted by Eli Lilly Corp., and in other settings. The current release is Karma 2.1.

Download

The current version of Karma (v2.1) supports provenance activities published from services, workflows, and nested workflows. Provenance data is stored efficiently in a relational database, and the tool supports the emerging Open Provenance Model (OPM) standard as its interface. Karma supports both data-oriented and process-oriented views of the provenance. Karma v2.1 accepts provenance activities either synchronously, through a web-services API, or in a scalable asynchronous mode using WS-Eventing notifications. Provenance clients can use the Notifier library to generate provenance activities. Synchronous recording is suggested when recording provenance on the scale of hundreds of workflows, or when setting up a notification broker is to be avoided. WS-Messenger is the suggested WS-Eventing notification broker. The Karma service requires MySQL v5.0 or later with a database assigned to it (preferably named 'karma2').
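To make the data-oriented and process-oriented views concrete, here is a minimal Python sketch of an OPM-style graph restricted to two edge types, "used" and "wasGeneratedBy". It is illustrative only: the class, method, workflow, and file names are hypothetical and are not Karma's API or database schema.

```python
class ProvenanceGraph:
    """Toy OPM-style graph (hypothetical, not Karma's actual data model)."""

    def __init__(self):
        self.used = {}          # process -> set of artifacts it consumed
        self.generated_by = {}  # artifact -> process that produced it

    def record_used(self, process, artifact):
        self.used.setdefault(process, set()).add(artifact)

    def record_generated(self, artifact, process):
        self.generated_by[artifact] = process

    def data_lineage(self, artifact):
        """Data-oriented view: all artifacts the given artifact derives from."""
        ancestors = set()
        stack = [artifact]
        while stack:
            producer = self.generated_by.get(stack.pop())
            if producer is None:
                continue  # an input with no recorded producer
            for source in self.used.get(producer, ()):
                if source not in ancestors:
                    ancestors.add(source)
                    stack.append(source)
        return ancestors

    def upstream_processes(self, artifact):
        """Process-oriented view: all processes that contributed to the artifact."""
        processes = set()
        stack = [artifact]
        while stack:
            producer = self.generated_by.get(stack.pop())
            if producer is None or producer in processes:
                continue
            processes.add(producer)
            stack.extend(self.used.get(producer, ()))
        return processes


# A two-step workflow: ingest cleans raw data, forecast consumes the result.
graph = ProvenanceGraph()
graph.record_used("ingest", "raw.nc")
graph.record_generated("clean.nc", "ingest")
graph.record_used("forecast", "clean.nc")
graph.record_used("forecast", "config.xml")
graph.record_generated("forecast.nc", "forecast")

print(sorted(graph.data_lineage("forecast.nc")))        # ['clean.nc', 'config.xml', 'raw.nc']
print(sorted(graph.upstream_processes("forecast.nc")))  # ['forecast', 'ingest']
```

The same recorded edges answer both questions: the data-oriented query walks from an artifact back to its source artifacts, while the process-oriented query collects the processes along that path.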

Karma v2.1.0: the Karma distribution files required to run the provenance service and GUI [source] [binary]

Notifier v1.0: the Notifier (i.e., Provenance Tracking) library required to publish provenance activities [source] [binary]

WS-Messenger is a WS-Eventing-based notification broker used to publish provenance activities asynchronously. More
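The difference between the synchronous and asynchronous submission modes can be sketched in a few lines of Python. This is an assumption-laden illustration, not Karma's web-services API or the WS-Eventing protocol: an in-process queue stands in for the notification broker, and ProvenanceStore, publish_sync, and AsyncPublisher are hypothetical names.

```python
import queue
import threading


class ProvenanceStore:
    """Stand-in for the provenance service's storage back end (hypothetical)."""

    def __init__(self):
        self.activities = []
        self._lock = threading.Lock()

    def record(self, activity):
        with self._lock:
            self.activities.append(activity)


def publish_sync(store, activity):
    """Synchronous mode: the caller waits until the activity is recorded."""
    store.record(activity)


class AsyncPublisher:
    """Asynchronous mode: the caller enqueues and returns immediately.

    The queue plays the role of a notification broker; a single worker
    thread drains it into the store.
    """

    def __init__(self, store):
        self._store = store
        self._queue = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def publish(self, activity):
        self._queue.put(activity)

    def _drain(self):
        while True:
            activity = self._queue.get()
            self._store.record(activity)
            self._queue.task_done()

    def flush(self):
        """Block until every queued activity has been recorded."""
        self._queue.join()


store = ProvenanceStore()
publish_sync(store, {"workflow": "wf-1", "event": "invoked"})

publisher = AsyncPublisher(store)
for step in range(3):
    publisher.publish({"workflow": "wf-2", "event": f"step-{step}"})
publisher.flush()

print(len(store.activities))  # 4
```

The synchronous path needs no extra infrastructure but makes each caller wait for the store. The asynchronous path decouples callers from the store at the cost of running and monitoring an intermediary.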

Documentation

Upcoming release: Karma v3.0 (expected release: early spring 2010)

Karma v3.0 will support instrumentation through Axis2 handlers in addition to Java applications, and will add a full test suite and better documentation.

Support for asynchronous communication using WS-Messenger will come in a follow-on release. WS-Messenger is being re-released through OGCE with Axis2 support. Future releases will support additional instrumentation toolkits and better query support.

Future (late spring):
  • Karma server (query and publish web service)
  • v3.0 plus: asynchronous communication with other pub/sub systems; additional instrumentation
  • v3.0 plus: Axis2 WS-Messenger; richer clients: preservation client, visualization, query
  • Dependencies: v2.1 plus WS-Messenger (latter optional)

Early spring, Karma v3.0:
  • Karma server (query and publish web service)
  • v2.1 plus: Axis2 handler instrumentation; test suite; fuller documentation
  • v2.1 plus: simple access client
  • Dependencies: v2.1

Karma v2.1:
  • Karma server (query and publish web service)
  • Java notifier library (synchronous calls to Karma service)
  • OPM RDF and XML results
  • Dependencies: MySQL

Publications

For a list of publications related to Karma please click here

Contact

  • Girish Subramanian [Email]
  • Professor Beth Plale [Email]