Provenance Challenge

Challenge
Challenge.Karma2

Second Provenance Challenge Template

Participating Team

Differences from First Challenge

Note here any changes in your provenance representation, workflow enactment or system since the first challenge. Alternatively, if you did not participate in the first challenge, please provide the same details as were required for those who did (particularly workflow representation and provenance representation).

Karma has a provenance collection part, which gathers provenance activities from workflow executions, and a provenance dissemination part, which generates views from those activities. In the First Challenge, we exposed views of the collected provenance, such as Workflow Trace and Data Provenance. The views alone were not sufficient to answer all queries; the underlying provenance activities are also needed. Hence, in addition to the provenance views, we are also submitting the provenance activities we collect, which give a complete description of each workflow run and contain sufficient information to answer all queries.

Provenance Data for Workflow Parts

Give links here to your provenance data files for the workflow parts of the challenge: three parts for the original workflow and three parts for the modified workflow (as per provenance query 7). The data files could be attached to the results page.

Workflow Model

|  | Workflow Part-I | Workflow Part-II | Workflow Part-III | Workflow Part-III-Alternate |
| Image | Workflow Image for Part-I | Workflow Image for Part-II | Workflow Image for Part-III | Workflow Image for Part-III (Alternate for Q7) |
| XBaya Document | challenge2-Part-I.xwf | challenge2-Part-II.xwf | challenge2-Part-III.xwf | challenge2-Part-III-Alternate.xwf |

You can view the documents in the XBaya Workflow Composer (a Java Web Start application).

Provenance Activities (Provenance Data to Import)

Each activity is an individual XML document. For convenience, the activities for each workflow have been concatenated into a single XML document under a single root element.
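As a minimal sketch of consuming one of these concatenated files, the following Python splits a document back into its individual activities, assuming only that each activity is a direct child of the root element. The sample XML, its element names (`activities`, `serviceInvoked`, `dataProduced`) and the `split_activities` helper are hypothetical stand-ins, not the actual Karma schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real files use Karma's own element names and namespaces.
SAMPLE = """<activities>
  <serviceInvoked><description>align_warp invoked</description></serviceInvoked>
  <dataProduced><description>warp file produced</description></dataProduced>
</activities>"""


def split_activities(xml_text):
    """Return each direct child of the root as a standalone XML string."""
    root = ET.fromstring(xml_text)
    # strip() drops the inter-element whitespace that ElementTree keeps as tail text.
    return [ET.tostring(child, encoding="unicode").strip() for child in root]


docs = split_activities(SAMPLE)
for doc in docs:
    # Each string can be parsed again as an independent document.
    print(ET.fromstring(doc).tag)
```

Each returned string is a well-formed document in its own right, so it can be handed to whatever importer expects one activity at a time.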

(For an example of a provenance view that Karma generates from these activities, see Stage I WorkflowTrace, Stage II WorkflowTrace, Stage III WorkflowTrace, and Alternate Stage III WorkflowTrace. These views are provided as samples only and are not intended to be imported.)

Provenance Activity Description & Utilities

Here is a description of the key provenance activities that can help with importing the data. The activity types described here encapsulate all information required to answer the Challenge queries. Other activity types present in the dump above are useful for more complex querying and monitoring requirements.

In our model, we make a distinction between an abstract service and a service instance. Workflows are composed by connecting abstract services, while service instances are bound late and used during workflow invocation. Service instances are identified using a globally unique 'serviceID'. Workflows are themselves considered a type of service. Hence, you can compose an abstract workflow out of abstract services and out of other abstract workflows. A workflow instance likewise has a globally unique 'serviceID' that identifies it. Data products are identified by a globally unique ID, and they optionally have a replica URL associated with them when they appear in activities.

Service instances are usually invoked in the context of a workflow. An invocation is identified by four parts:

  1. the 'serviceID' of the service being invoked,
  2. the 'workflowID' of the (parent) workflow instance in whose context the invocation takes place (its value equals the serviceID of the workflow),
  3. the 'workflowTimestep', which gives a logical time for the service invocation in the workflow lifecycle, and
  4. the 'workflowNodeID', which uniquely identifies a node in the workflow graph (the same abstract service used multiple times in an abstract workflow will have different workflowNodeIDs).

All activities have these four IDs set as the notification source for that activity (with the exception of the Initialized activities described below, which have only the serviceID). They also optionally carry the ID of the client that invokes the service or receives the response. All activities have a timestamp field that gives the time at which they were generated, a human-readable description, and an optional XML 'annotation' field for extensions.
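One way to picture the four-part notification source is as an immutable record that can serve as a dictionary key when grouping activities by invocation. The sketch below is illustrative only: the `InvocationKey` class, its field names, the integer type of the timestep, and the sample URNs are assumptions, not part of the Karma libraries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InvocationKey:
    """The four IDs carried as an activity's notification source (hypothetical type)."""
    service_id: str
    workflow_id: Optional[str] = None
    workflow_timestep: Optional[int] = None  # assumed to be an integer logical time
    workflow_node_id: Optional[str] = None

    def is_invocation(self):
        # Activities raised outside a workflow invocation carry only the serviceID.
        return None not in (self.workflow_id, self.workflow_timestep, self.workflow_node_id)


# The same service used at two nodes of one workflow yields two distinct invocations.
first = InvocationKey("urn:svc:align_warp", "urn:wf:challenge2", 3, "align_warp_1")
second = InvocationKey("urn:svc:align_warp", "urn:wf:challenge2", 4, "align_warp_2")
init_only = InvocationKey("urn:svc:align_warp")

by_invocation = {first: [], second: []}
print(len(by_invocation), first.is_invocation(), init_only.is_invocation())
```

Freezing the dataclass makes instances hashable, which is what allows them to be used directly as grouping keys.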

Some key activities in a workflow's lifecycle (in order of generation) are:

  • Service/Workflow Initialized : Generated when a service instance or workflow instance starts. These activities are generated only when a new instance is created, so they may not be published for every workflow run if a compatible service/workflow instance was already available. Because an instance can be reused by multiple workflows, these activities are not generated in the context of a service/workflow invocation, and their notification source has only the serviceID field set.
  • Service/Workflow Invoked : An invocation consists of a request message and an optional response message, depending on the message exchange pattern. For the challenge workflows, both are present. The Service/Workflow Invoked activity is produced when the request message of the invocation is received by the service instance. Other than the default fields, it contains optional request header and request body elements that hold the SOAP (or equivalent) request Header and Body.
  • Data Consumed : Describes a data product that is used by this invocation. Provides better guarantees than looking for data IDs in SOAP messages.
  • Data Produced : Describes a data product that is generated by this invocation.
  • Sending Result : Produced when the response message of the invocation is sent by the service instance to the receiver (client). Other than the default fields, it contains optional result header and result body elements that hold the SOAP (or equivalent) response Header and Body.
  • Service/Workflow Terminated : Describes a service or workflow instance that is shutting down.
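To illustrate how the Data Consumed and Data Produced activities could be combined to answer lineage-style queries, here is a rough sketch that groups them by invocation and then walks the resulting derivation map. It assumes the activities have already been parsed into tuples, and it makes the coarse assumption that every output of an invocation depends on every input of that invocation. The tuple layout, the IDs, and both helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical, already-parsed activities: (activity type, invocation key, data product ID).
ACTIVITIES = [
    ("DataConsumed", ("svc:softmean", "wf:1", 5, "softmean"), "data:resliced1"),
    ("DataConsumed", ("svc:softmean", "wf:1", 5, "softmean"), "data:resliced2"),
    ("DataProduced", ("svc:softmean", "wf:1", 5, "softmean"), "data:atlas"),
    ("DataConsumed", ("svc:slicer", "wf:1", 6, "slicer"), "data:atlas"),
    ("DataProduced", ("svc:slicer", "wf:1", 6, "slicer"), "data:atlas-x"),
]


def derived_from(activities):
    """Map each produced data ID to the data IDs consumed by the same invocation."""
    consumed = defaultdict(set)
    produced = defaultdict(set)
    for kind, key, data_id in activities:
        if kind == "DataConsumed":
            consumed[key].add(data_id)
        elif kind == "DataProduced":
            produced[key].add(data_id)
    return {out: set(consumed[key]) for key, outs in produced.items() for out in outs}


def ancestors(data_id, parents):
    """Walk the derivation map transitively to collect every upstream data ID."""
    seen = set()
    stack = [data_id]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


parents = derived_from(ACTIVITIES)
print(sorted(ancestors("data:atlas-x", parents)))
# prints ['data:atlas', 'data:resliced1', 'data:resliced2']
```

Keying on the full invocation tuple rather than the serviceID alone keeps repeated uses of the same service in one workflow from being merged.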

Karma libraries are available for reading the XML activities as Apache XMLBeans objects in Java.

Model Integration Results

State here which combinations of teams' models you have managed to perform the provenance query over

Translation Details

Describe details regarding how data models were translated (or otherwise used to answer the query following the team's approach), any data which was absent from a downloaded model, and whether this affected the possibility of translation or successful provenance query, and any data which was excluded in translation from a downloaded model because it was extraneous

Benchmarks

Describe your proposed benchmark queries, how the comparable quantities are determined, and the results of applying the benchmark to your own system

Further Comments

Provide here further comments.

Conclusions

Provide here your conclusions on the challenge, and issues that you like to see discussed at a face to face meeting.

-- YogeshSimmhan - 22 Feb 2007

| Attachment | Size | Date | Who | Comment |
| challenge2-Part-I-Activities.xml | 113.1 K | 22 Feb 2007 - 02:22 | YogeshSimmhan | Provenance Activities for Workflow Part-I |
| challenge2-Part-II-Activities.xml | 29.2 K | 22 Feb 2007 - 01:48 | YogeshSimmhan | Provenance Activities for Workflow Part-II |
| challenge2-Part-III-Activities.xml | 65.2 K | 22 Feb 2007 - 03:11 | YogeshSimmhan | Provenance Activities for Workflow Part-III |
| challenge2-Part-III-Alternate-Activities.xml | 131.8 K | 22 Feb 2007 - 01:49 | YogeshSimmhan | Provenance Activities for Workflow Part-III-Alternate for Q7 |
