
Provenance Challenge


This page is in progress.

Participating Team

  • Short team name: MINDSWAP
  • Participant names: Jennifer Golbeck
  • Project URL: http://provenance.mindswap.org
  • Project Overview: Using Semantic Web Technologies for Provenance
  • Provenance-specific Overview:
  • Relevant Publications:

Jennifer Golbeck. 2006. Combining Provenance with Trust in Social Networks for Semantic Web Content Filtering. Proceedings of the International Provenance and Annotation Workshop. Chicago, Illinois, May, 2006. ( http://trust.mindswap.org/papers/IPAW_Trust.pdf )

Christian Halaschek-Wiener, Jennifer Golbeck, Andrew Schain, Michael Grove, Bijan Parsia, and Jim Hendler. Annotation and provenance tracking in semantic web photo libraries. Proceedings of the International Provenance and Annotation Workshop (IPAW). Chicago, Illinois, May 2006. ( http://www.mindswap.org/papers/2006/IPAW_PhotoStuff.pdf )

Workflow Representation

The workflow has been encoded using an OWL ontology at http://provenance.mindswap.org/provenance.owl

Provenance Trace

The structure of the provenance trace can be extracted by following the logical and rule-based connections between instances of the ontology. Our ontology, instance data, and rules are all expressed in OWL. They can be browsed at http://provenance.mindswap.org (this site is still being constructed, so pardon any errors). Each page includes a visualization that shows the connections of that instance to a certain depth. Transitive roles in the ontology allow provenance to be tracked back to the original files. More complex information is retrieved with SPARQL queries.
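The effect of a transitive role can be illustrated outside of OWL. The following is a minimal Python sketch, not the MINDSWAP implementation: it assumes a hypothetical table of direct "produced from" links between files (all file names are made up) and follows them until no new ancestors appear, which is the set a reasoner would infer for a transitive ancestor property.

```python
from collections import deque

# Hypothetical direct links: output file -> the files it was directly produced from.
# The file names are invented for illustration and are not taken from the MINDSWAP data.
DIRECT_INPUTS = {
    "atlas-x.gif": ["atlas-x.pgm"],
    "atlas-x.pgm": ["atlas.img"],
    "atlas.img": ["reslice1.img", "reslice2.img"],
    "reslice1.img": ["anatomy1.img"],
    "reslice2.img": ["anatomy2.img"],
}


def file_ancestors(start, direct_inputs):
    """Return every file reachable from `start` by following direct input links."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for parent in direct_inputs.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def original_files(start, direct_inputs):
    """Return the ancestors with no recorded inputs of their own (the original files)."""
    return {f for f in file_ancestors(start, direct_inputs) if f not in direct_inputs}
```

Calling `original_files("atlas-x.gif", DIRECT_INPUTS)` on this toy data yields the two anatomy files. In the ontology the same closure is obtained declaratively, so no traversal code is needed on the query side.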

Provenance Queries Matrix

Teams          Queries
               Q1   Q2   Q3   Q4   Q5   Q6   Q7   Q8   Q9
MINDSWAP team  yes  yes  yes  yes  yes  yes  yes  yes  yes

Provenance Queries

Queries can be run in real time at http://www.mindswap.org/~golbeck/queries.html

Write your own queries at http://provenance.mindswap.org/query/

We use SPARQL for all of our queries. Below is the SPARQL that we use to answer each of the queries in the challenge. In some queries (for example, query 1), we give a full URI. Each such URI refers to a specific instance of a graphic or Service Execution, and it can be replaced with the URI of any other instance to run the same query about that instance. It can also be replaced with a variable to query for the data related to every instance.
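The substitution described above can be sketched as a small query builder. This is an illustrative Python sketch only: the helper names `subject_term` and `describe_query` are hypothetical, and the generated text follows the shape of query 1, with the subject given either as a full URI or as a SPARQL variable.

```python
PREFIXES = (
    "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
    "PREFIX prov: <http://provenance.mindswap.org/provenance.owl#>\n"
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
)


def subject_term(subject):
    """Render a subject as a SPARQL variable (?name) or as a full URI (<...>)."""
    if subject.startswith("?"):
        return subject
    # Reject characters that cannot appear inside a SPARQL <...> URI reference.
    if any(ch in subject for ch in '<>" {}|\\^`'):
        raise ValueError("not a usable URI: %r" % subject)
    return "<" + subject + ">"


def describe_query(subject):
    """Build a query in the shape of query 1: every property and value of `subject`."""
    term = subject_term(subject)
    # When the subject is a variable, select it too so each row names its instance.
    selected = "?prop ?x" if term.startswith("<") else term + " ?prop ?x"
    return (
        PREFIXES
        + "SELECT DISTINCT " + selected + "\n"
        + "WHERE { " + term + " ?prop ?x. }"
    )
```

For example, `describe_query("?graphic")` produces a query over all instances, while passing the Graphic URI used in query 1 reproduces that query's pattern for one specific instance.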

  1. Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is. This should tell us the new brain images from which the averaged atlas was generated, the warping performed etc.

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX prov: <http://provenance.mindswap.org/provenance.owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?prop ?x
    WHERE {
      <http://provenance.mindswap.org/files/ConvertExecution-1155932184.62982.owl#Graphic1155932184.62982> ?prop ?x.
    }

  2. Find the process that led to Atlas X Graphic, excluding everything prior to the averaging of images with softmean.

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX prov: <http://provenance.mindswap.org/provenance.owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?x ?y ?z
    WHERE {
      ?x ?y ?z.
      <http://provenance.mindswap.org/files/ConvertExecution-1155932184.62982.owl#Graphic1155932184.62982> ?prop ?x
        FILTER ( ?prop = prov:hasServiceExecutionAncestor || ?prop = prov:producedFromServiceExecution).
      ?x prov:hasOutputFile ?f.
      ?f ?prop2 ?serv
        FILTER ( ?prop2 = prov:hasServiceExecutionAncestor || ?prop2 = prov:producedFromServiceExecution).
      ?serv prov:serviceUsed <http://provenance.mindswap.org/provenance.owl#softmean>
    }

  3. Find the Stage 3, 4 and 5 details of the process that led to Atlas X Graphic.

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX prov: <http://provenance.mindswap.org/provenance.owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?x ?propX ?obX ?y ?propY ?obY ?z ?propZ
    WHERE {
      <http://provenance.mindswap.org/files/ConvertExecution-1155932184.62982.owl#Graphic1155932184.62982> prov:hasServiceExecutionAncestor ?x;
        prov:hasServiceExecutionAncestor ?y;
        prov:producedFromServiceExecution ?z.
      ?x prov:stage "3"; ?propX ?obX.
      ?y prov:stage "4"; ?propY ?obY.
      ?z prov:stage "5"; ?propZ ?obZ.
    }

  4. Find all invocations of procedure align_warp using a twelfth order nonlinear 1365 parameter model (see model menu describing possible values of parameter "-m 12" of align_warp) that ran on a Monday.

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX prov: <http://provenance.mindswap.org/provenance.owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?x
    WHERE {
      ?z rdfs:label ?x;
        prov:dayOfWeekRun "Mon";
        prov:hasTextInputParameters " -m -12 -q";
        prov:serviceUsed <http://provenance.mindswap.org/provenance.owl#align_warp>.
    }

  5. Find all Atlas Graphic images outputted from workflows where at least one of the input Anatomy Headers had an entry global maximum=4095. The contents of a header file can be extracted as text using the scanheader AIR utility.

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX prov: <http://provenance.mindswap.org/provenance.owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?z
    WHERE {
      ?x rdf:type <http://provenance.mindswap.org/provenance.owl#Graphic>;
        prov:hasFileAncestor ?f;
        rdfs:label ?z.
      ?f prov:annotation "maximum=4095".
    }

  6. Find all output averaged images of softmean (average) procedures, where the warped images taken as input were align_warped using a twelfth order nonlinear 1365 parameter model, i.e. "where softmean was preceded in the workflow, directly or indirectly, by an align_warp procedure with argument -m 12."

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX prov: <http://provenance.mindswap.org/provenance.owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?q
    WHERE {
      ?q prov:producedFromServiceExecution ?m;
        prov:hasFileAncestor ?y.
      ?m prov:serviceUsed <http://provenance.mindswap.org/provenance.owl#softmean>.
      ?y prov:producedFromServiceExecution ?z.
      ?z prov:hasTextInputParameters " -m -12 -q";
        prov:serviceUsed <http://provenance.mindswap.org/provenance.owl#align_warp>.
    }

  7. A user has run the workflow twice, in the second instance replacing each procedure (convert) in the final stage with two procedures: pgmtoppm, then pnmtojpeg. Find the differences between the two workflow runs. The exact level of detail in the difference that is detected by a system is up to each participant.

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX prov: <http://provenance.mindswap.org/provenance.owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?z1 ?z2
    WHERE {
      <http://provenance.mindswap.org/files/WorkflowExecution-1157476801.94233.owl#WorkflowExecution-1157476801.94233> ?p1 ?x.
      ?x ?p2 ?z1.
      <http://provenance.mindswap.org/files/WorkflowExecution-1155932184.6428.owl#WorkflowExecution-1155932184.6428> ?p1 ?y.
      ?y ?p2 ?z2.
      FILTER (?z1 = ?z2)
    }

    Since we reference files by URI, the outputs will be very different. Each file output will differ, as well as the services used and inputs.

  8. A user has annotated some anatomy images with a key-value pair center=UChicago. Find the outputs of align_warp where the inputs are annotated with center=UChicago.

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX prov: <http://provenance.mindswap.org/provenance.owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?y
    WHERE {
      ?x prov:annotation "center=UChicago".
      ?y prov:hasInputFile ?x;
        prov:serviceUsed <http://provenance.mindswap.org/provenance.owl#align_warp>.
    }

  9. A user has annotated some atlas graphics with key-value pair where the key is studyModality. Find all the graphical atlas sets that have metadata annotation studyModality with values speech, visual or audio, and return all other annotations to these files.

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX prov: <http://provenance.mindswap.org/provenance.owl#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT DISTINCT ?xl ?z
    WHERE {
      ?x rdf:type prov:Graphic;
        rdfs:label ?xl.
      ?x prov:annotation ?z.
      ?x prov:annotation ?a
        FILTER ( ?a = "studyModality=speech" || ?a = "studyModality=audio" || ?a = "studyModality=visual" ).
    }
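
For query 7 in the list above, the SPARQL filter keeps the values that the two workflow executions share. One possible way to turn per-run results into an explicit difference is a set comparison on the client side. The Python sketch below assumes each run has already been reduced to (property, value) pairs; the function name `diff_runs` and the sample rows are hypothetical and do not come from the MINDSWAP data.

```python
def diff_runs(run_a, run_b):
    """Split two collections of (property, value) pairs into shared and run-specific parts."""
    a, b = set(run_a), set(run_b)
    return {
        "shared": a & b,
        "only_first": a - b,
        "only_second": b - a,
    }


# Made-up rows standing in for the (?p2, ?z) bindings of each workflow execution.
FIRST_RUN = [
    ("prov:serviceUsed", "prov:softmean"),
    ("prov:serviceUsed", "prov:convert"),
]
SECOND_RUN = [
    ("prov:serviceUsed", "prov:softmean"),
    ("prov:serviceUsed", "prov:pgmtoppm"),
    ("prov:serviceUsed", "prov:pnmtojpeg"),
]

result = diff_runs(FIRST_RUN, SECOND_RUN)
```

On this toy input, `result["only_first"]` holds the convert service and `result["only_second"]` holds pgmtoppm and pnmtojpeg. Because files are referenced by URI, real output files would always land in the run-specific sets, so a comparison like this is most informative on services and parameters.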

Suggested Workflow Variants

Suggest variants of the workflow that can exhibit capabilities that your system supports.

Suggested Queries

Suggest significant queries that your system can support and are not in the proposed list of queries, and how you have implemented/would implement them. These queries may be with regards to a variant of the workflow suggested above.

Categorisation of queries

According to your provenance approach, you may be able to provide a categorisation of queries. Can you elaborate on the categorisation and its rationale?

Live systems

The system can be accessed and tested in its current form at http://provenance.mindswap.org

Further Comments

Provide here further comments.

Conclusions

Provide here your conclusions on the challenge, and issues that you would like to see discussed at a face to face meeting.

-- LucMoreau - 31 May 2006

-- JenniferGolbeck - 28 Jun 2006

