Minutes of Telephone Conversation between Mike (Chicago), Simon and Paul (Southampton)

Following a tutorial in which Paul and Simon described the data structures used in PASOA's provenance architecture, we discussed possible integration work between PASOA and VDS. The conversation was brief, and we agreed to talk further in early January. I have split the minutes up into three topics for clarity, though they overlap.

VDS data as process documentation

  • We discussed exposing data about the process of running a VDL workflow, i.e. the VDL itself plus the log data collected in the VDC, in a provenance store.
  • The task would be split into two stages: running a workflow as usual, then converting the resulting data into the PASOA data structure and recording it in a provenance store.
  • This would simulate an approach where the actors in a VDS recorded PASOA process documentation to a provenance store directly.
  • We can then show queries being made against the process documentation.


  • Mike will put fMRI VDL workflow(s) on the TWiki in early January.
  • We will then talk again to plan the other stages.
  • As part of this work, a provenance store specifically for this experiment will be deployed at Southampton or Chicago.
  • Ideally, we will be in a position to show a query being run for the Open Science Grid All Hands Meeting on January 25th.
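
As a rough illustration of the two-stage approach discussed above, the following Python sketch converts already-collected log records into simplified process-documentation records and then runs a query over them. It is only a sketch under stated assumptions: the names (LogRecord, Assertion, convert, derived_from), the record fields, and the file names are all hypothetical, and none of them reflect the actual VDC schema or the PASOA data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical stand-in for one logged invocation in a workflow run."""
    transformation: str
    inputs: tuple
    outputs: tuple


@dataclass(frozen=True)
class Assertion:
    """Hypothetical, simplified record: `output` was produced from `source` by `actor`."""
    actor: str
    source: str
    output: str


def convert(log):
    """Stage 2: turn log records into assertions, one per (input, output) pair."""
    return [
        Assertion(rec.transformation, src, out)
        for rec in log
        for out in rec.outputs
        for src in rec.inputs
    ]


def derived_from(store, item):
    """Query: every data item that `item` was directly or indirectly derived from."""
    found = set()
    frontier = [item]
    while frontier:
        current = frontier.pop()
        for assertion in store:
            if assertion.output == current and assertion.source not in found:
                found.add(assertion.source)
                frontier.append(assertion.source)
    return found


# Stage 1 is assumed to have happened already: the workflow ran and left a log.
# The transformation and file names below are invented for illustration.
log = [
    LogRecord("align", ("raw.img",), ("aligned.img",)),
    LogRecord("average", ("aligned.img", "ref.img"), ("atlas.img",)),
]
store = convert(log)
ancestors = derived_from(store, "atlas.img")
```

Here the in-memory list stands in for a provenance store; in the planned experiment the converted records would instead be recorded in a deployed store and queried there.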

Paper on VDL-style provenance questions

  • If an e-science community has structured their applications as workflows using a particular environment or language (VDS, BPEL etc.), they will want to express their provenance questions in the same terms.
  • We would like to write a paper on asking provenance questions for an application described in VDL.
  • This could take the form of the "20 queries" posed in the paper Data Mining the SDSS Sky Server, where each question is given first in English and then as a computer-parsable expression that can be run (SQL, in that paper's case).
  • This paper may be prepared for the IPAW workshop.


  • This should follow on from the fMRI work discussed above: we will identify questions in application terms for fMRI experiments, and show a mapping to concrete queries over the provenance store.
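
To make the idea of pairing each question with a runnable expression concrete, here is a minimal sketch using Python's standard sqlite3 module. The table layout, the file and step names, and the two questions are assumptions made purely for illustration; they are not taken from the fMRI application or from any actual provenance store schema.

```python
import sqlite3

# Hypothetical, simplified stand-in for a provenance store: one row per
# "output was produced from source by actor" relationship.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE derivation (actor TEXT, source TEXT, output TEXT)")
conn.executemany(
    "INSERT INTO derivation VALUES (?, ?, ?)",
    [
        ("align", "raw.img", "aligned.img"),
        ("average", "aligned.img", "atlas.img"),
        ("average", "ref.img", "atlas.img"),
    ],
)

# Each entry pairs a question in application terms with a concrete query.
QUESTIONS = [
    (
        "Which inputs were used directly to produce atlas.img?",
        "SELECT source FROM derivation WHERE output = 'atlas.img' ORDER BY source",
    ),
    (
        "Which processing steps consumed aligned.img?",
        "SELECT DISTINCT actor FROM derivation WHERE source = 'aligned.img'",
    ),
]

# Run every query and keep the first column of each result row.
answers = {
    question: [row[0] for row in conn.execute(sql)]
    for question, sql in QUESTIONS
}
```

The point of the pairing is that the English question stays in the scientists' terms while the query is what actually runs against the store.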

Paper on large-scale use of provenance

  • We discussed preparing a paper on recording and querying provenance produced by large workflows constructed and used by large e-science communities.
  • This would demonstrate the applicability of the provenance data structures in real, significant use.
  • The workflows would again come from fMRI.


  • This should follow on from the fMRI work discussed above: we will use larger workflows and involve fMRI scientists in writing provenance questions.
  • We would aim to get quantitative data by May.

Other Matters

  • We will also later discuss topics about managing process documentation, such as the possibility of garbage collection.

-- SimonMiles - 16 Dec 2005