The Fourth and Last Provenance Challenge
Early Success For the Fourth Provenance Challenge
Back at IPAW 2006, in Chicago, we discussed the need for provenance standardization. This discussion initiated the successful Provenance Challenge series, with three editions: PC1, PC2, and PC3 [1]. The Provenance Challenge activity was instrumental in designing the Open Provenance Model [2], a model for provenance with take-up well beyond the original Provenance Challenge community.
Events at the World Wide Web Consortium (W3C) Incubator on Provenance [3] have overtaken our initial move towards a last challenge, PC4. The members of the incubator group, including some of us, dedicated significant time to making the Incubator a success. This resulted in the recent creation of the W3C Provenance Working Group [4], in which several of us are also involved.
Since we all have limited bandwidth for community activities, it is not realistic to run both activities in parallel. Furthermore, the inter-operability motivation of PC4 is being pursued by the W3C Provenance Working Group, with the rigor of a standardization body.
Given this, we feel it would be best to end the Provenance Challenge series, and declare its success in having set the agenda for provenance inter-operability. This would not be the end of this mailing list or the wiki. These facilities are welcome to be used for continued community building, announcements, etc.
The success of the challenge is really a testament to the vitality and dedication of this community.
Does this mean there are no more challenges related to provenance? Surely not; they will simply be tackled in a different form.
Best regards,
Luc Moreau and Paul Groth
[1] http://twiki.ipaw.info/bin/view/Challenge/WebHome
[2] http://openprovenance.org/
[3] http://www.w3.org/2005/Incubator/prov/wiki/W3C_Provenance_Incubator_Group_Wiki
[4] http://www.w3.org/2011/prov/wiki/Main_Page
Motivation
The FirstProvenanceChallenge (PC1) was designed to compare the expressiveness of provenance systems. It was followed by the SecondProvenanceChallenge (PC2) to exchange provenance information between systems. The consensus that followed led to a proposal for the Open Provenance Model (OPM), a data model for provenance. OPM was tested during the ThirdProvenanceChallenge (PC3). Following the success of this challenge, an open-source governance approach was adopted for OPM, which led to the revision OPM v1.1.
Three considerations are motivating the launch of a novel challenge:
- So far, the Provenance Challenge activity has had a strong focus on scientific workflows. While we certainly wish to keep the involvement of the scientific workflow community, we would like to demonstrate the broader applicability of provenance technology. For instance, it would be desirable to consider scenarios that involve users, where computations take place on the desktop and in the cloud, where various forms of artifacts are manipulated, e.g. data sets, files, documents, databases, and where artifacts are published and downloaded from the Web.
- Furthermore, there is no point capturing provenance if we do not make use of it. It would therefore be desirable to make use of provenance, to demonstrate functionality that would have been impossible to implement without provenance.
- Broader scenarios in which provenance is captured, and better exploitation of provenance to demonstrate functionality, make us converge towards an end-to-end scenario in which multiple technologies are involved, and which really justifies the need for an interoperable solution.
Hence, the purpose of the Fourth and Last Provenance Challenge is to apply the Open Provenance Model to a broad end-to-end scenario, and demonstrate novel functionality that can only be achieved by the presence of an interoperable solution for provenance. This challenge, the last one in this successful series, will be its natural conclusion, since it will exploit OPM in an end-to-end scenario, following the steps of understanding provenance (PC1), posing the problem of provenance inter-operability (PC2), and testing the OPM solution (PC3).
In parallel, we note the activities of the W3C Incubator on Provenance, which has collected use cases, derived requirements, and is beginning work on a technology roadmap. The Incubator and PC4 are complementary activities, which should cross-fertilize each other. The Incubator's use cases and requirements can influence the PC4 scenario, whereas PC4's practical experience with OPM can inform the Incubator.
Whilst inter-operability is a pragmatic consideration, it entails fundamental research questions. The fourth challenge remains a research activity, and we will aim to disseminate results. Following PC1, a special issue was published by Concurrency and Computation: Practice and Experience. A special issue is under preparation for PC3 with the journal Future Generation Computer Systems. It is proposed that a book of contributed articles be published after PC4.
Proposed Process
As previously, the process, scope, aims and timetable will be community-driven. We propose here a process to bootstrap the PC4 activity, inspired by PC3.
- Phase 1: Scenario proposal: teams propose scenarios and provenance-based functionality the challenge could address.
- Phase 2: Scenario selection: proposed scenarios will be reviewed, discussed, refined, and put to a vote.
- Phase 3: Expressions of interest: to get a sense of the participating teams, and to help structure the challenge, teams will be invited to register their interest.
- Phase 4: PC4 scoping workshop.
Timetable
- Jul 15: Abstract Scenario
- August 30: Identify all the data flowing in the system with respect to the crystallography scenario (this can be mocked up); where possible we have example data:
- Nov 30: For each pattern of the process, produce a mock-up of the OPM graph with respect to the data identified in the previous step, and make sure the graphs stitch together
- Dec 15: Finalize queries with respect to scenario
- Feb 28: Import and implement queries over the mock-up
- Feb 28: Generate and publish provenance for each pattern
- Mar 30: Import and implement queries over the generated provenance
- Mar ??: Decide whether to do API compatibility
- June 1: Start preparing slides for the challenge
- June 10: PC4 Workshop (co-locate with SIGMOD on June 12 ??)
- Date TBC: Call for Papers for Book (The Provenance Challenge Experience)
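To make the mock-up and stitching steps in the timetable more concrete, here is a minimal sketch in Python, not a prescribed implementation. It assumes each pattern is mocked up as a small graph using only the OPM edge types `used` and `wasGeneratedBy`, and that fragments stitch together through shared artifact identifiers. The class `OpmGraph`, the helpers `stitch` and `upstream`, and all node identifiers (loosely inspired by the crystallography scenario) are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field

# Two OPM edge types; edges point from effect to cause.
USED = "used"
WAS_GENERATED_BY = "wasGeneratedBy"


@dataclass
class OpmGraph:
    """A hypothetical in-memory mock-up of an OPM graph fragment."""
    artifacts: set = field(default_factory=set)
    processes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (effect, edge_type, cause)

    def used(self, process, artifact):
        """Record that `process` used `artifact`."""
        self.processes.add(process)
        self.artifacts.add(artifact)
        self.edges.add((process, USED, artifact))

    def was_generated_by(self, artifact, process):
        """Record that `artifact` was generated by `process`."""
        self.artifacts.add(artifact)
        self.processes.add(process)
        self.edges.add((artifact, WAS_GENERATED_BY, process))


def stitch(*graphs):
    """Merge fragments; nodes with the same identifier join them."""
    merged = OpmGraph()
    for g in graphs:
        merged.artifacts |= g.artifacts
        merged.processes |= g.processes
        merged.edges |= g.edges
    return merged


def upstream(graph, node):
    """Return every node that `node` transitively depends on."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for effect, _edge_type, cause in graph.edges:
            if effect == current and cause not in seen:
                seen.add(cause)
                stack.append(cause)
    return seen


# Pattern 1 (hypothetical): data collection.
collect = OpmGraph()
collect.used("process:collect", "artifact:crystal_sample")
collect.was_generated_by("artifact:diffraction_images", "process:collect")

# Pattern 2 (hypothetical): structure solution.
solve = OpmGraph()
solve.used("process:solve", "artifact:diffraction_images")
solve.was_generated_by("artifact:structure_model", "process:solve")

# The two fragments stitch together through the shared artifact id.
whole = stitch(collect, solve)
lineage = upstream(whole, "artifact:structure_model")
```

In this sketch, a lineage query over the stitched graph reaches the crystal sample, whereas the same query over the `solve` fragment alone stops at the diffraction images. A check of this kind is one possible way to confirm that mock-ups stitch together.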
PC4 Scoping Workshop
-- LucMoreau - 29 Mar 2010
-- PaulGroth - 12 Jul 2010