The workflow has to be adapted to the WS-VLAM system for its composition and execution. While performing this adaptation, we already found multiple ways to map the command-line workflow to its WS-VLAM counterpart. It is possible, for instance, to execute some processes by farming them or to run them sequentially instead; or to create one 'big' workflow or to use composite (hierarchical) workflows. Some of these initial alternatives are described in the slides contained in this PowerPoint file.
The following figures present the different stages of generating the provenance. On the one hand, Figure 1 shows the WS-VLAM Composer with the design, or representation, of the workflow. On the other hand, the OPM Graph generated by PLIER can be viewed in XML format, in Figure 2, or as a diagram, in Figure 3.
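The choice between farmed and sequential execution can be illustrated with a minimal sketch in plain Python. It does not use any WS-VLAM API; `process_case`, `run_sequential`, and `run_farmed` are hypothetical names, and a thread pool merely stands in for farming a process over several cases.

```python
from concurrent.futures import ThreadPoolExecutor


def process_case(job_id: str) -> str:
    """Hypothetical stand-in for one workflow process applied to one case."""
    return f"processed {job_id}"


def run_sequential(job_ids):
    """Run the process on each case one after another."""
    return [process_case(job_id) for job_id in job_ids]


def run_farmed(job_ids, workers=4):
    """Run independent instances of the process in parallel ("farming").

    pool.map returns results in input order, so both strategies
    produce the same list.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_case, job_ids))


if __name__ == "__main__":
    cases = ["J062941", "J062942", "J062943", "J062944", "J062945"]
    print(run_sequential(cases) == run_farmed(cases))
```

Both strategies yield the same results for independent cases; the difference lies in how the processes are scheduled, which is the design decision the adaptation has to make.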
Figure 1. WS-VLAM Composer | Figure 2. PLIER (XML Tree) | Figure 3. PLIER (Diagram) |
As with the previous figures, the representations below show the WS-VLAM Composer, in Figure 4, and the PLIER tool, in Figure 5, with a more detailed workflow.
Figure 4. WS-VLAM Composer (detailed workflow) | Figure 5. PLIER (detailed workflow) |
Case (Job ID) | OPM Graphs (XML format) |
---|---|
J062941 | J062941-OPM.xml |
J062942 | J062942-OPM.xml |
J062943 | J062943-OPM.xml |
J062944 | J062944-OPM.xml |
J062945 | J062945-OPM.xml |
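
An exported OPM graph can be given a quick sanity check by counting its nodes. The sketch below is only an illustration: the sample document and the element names `process`, `artifact`, and `agent` are assumptions modelled on the three OPM node kinds, not the actual (revised) schema, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily simplified OPM-like document. Real exported graphs
# follow the OPM XML schema and contain many more elements.
SAMPLE = """<opmGraph>
  <processes><process id="p1"/><process id="p2"/></processes>
  <artifacts><artifact id="a1"/></artifacts>
  <agents><agent id="ag1"/></agents>
</opmGraph>"""


def local_name(tag: str) -> str:
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def count_nodes(xml_text: str) -> Counter:
    """Count process, artifact, and agent elements, ignoring namespaces."""
    wanted = {"process", "artifact", "agent"}
    root = ET.fromstring(xml_text)
    names = (local_name(element.tag) for element in root.iter())
    return Counter(name for name in names if name in wanted)


if __name__ == "__main__":
    print(count_nodes(SAMPLE))
```

To check one of the files listed above, its contents could be read from disk and passed to `count_nodes` instead of `SAMPLE`.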
The PLIER repository is a relational database that scientists access indirectly through the GUI tool or programmatically via the API. The repository handles the data but relies on the GUI end-user application to retrieve or query the information elements. These tools, however, are still under development and may only be able to browse the data rather than query it. Although PLIER does not provide any low-level manipulation mechanisms, it does not prevent the user from accessing the repository through SQL commands. Thus, for the sake of clarity, the OPM data will be queried using both OQL and SQL.
Our solution first notes that the WS-VLAM system is agnostic of what the processes do. The WS-VLAM engine schedules and submits each process for execution, while monitoring its progress. The module, in turn, executes its specific task as a black box, without being aware of the existence of the WS-VLAM system. Second, PLIER is meant to provide provenance for the workflow as a whole and, at this moment, it does not consider the granularity contained within the modules. Under these circumstances, if the given detection is handled inside a module, the generated provenance data by itself is insufficient to answer the query.
We can interpret this query from another point of view. In our provenance data, the filenames are provided as parameters to the modules to generate the output (or the input). The parameters are expressed in the OPM Graph as Agents, each having an ID that links it to the detection and a data Value. Thus, in order to retrieve the CSV files that contributed to a detection, we perform the following query:
SQL: select Agent.Value from AGENT where Agent.ID like '%Detection%'
Which returns:
P2_J062945_B001_P2fits0_20081115_P2Detection.csv
Another possibility is to employ a more general query that retrieves the CSV files participating in all events.
SQL: select * from AGENT where Agent.value like '%.csv'
Which returns:
P2_J062945_B001_P2fits0_20081115_P2Detection.csv
P2_J062945_B001_P2fits0_20081115_P2FrameMeta.csv
P2_J062945_B001_P2fits0_20081115_P2ImageMeta.csv
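The two queries can be tried out against a stand-in table, as in the following minimal sketch. It uses an in-memory SQLite database rather than the actual PLIER repository; the two-column layout of `AGENT`, the ID strings, and the non-CSV row are assumptions made only for illustration.

```python
import sqlite3


def build_demo_repository() -> sqlite3.Connection:
    """Create an in-memory stand-in for the AGENT table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE AGENT (ID TEXT PRIMARY KEY, Value TEXT)")
    rows = [
        # The IDs below are invented; only the filenames come from the text.
        ("J062945_P2Detection_file", "P2_J062945_B001_P2fits0_20081115_P2Detection.csv"),
        ("J062945_P2FrameMeta_file", "P2_J062945_B001_P2fits0_20081115_P2FrameMeta.csv"),
        ("J062945_P2ImageMeta_file", "P2_J062945_B001_P2fits0_20081115_P2ImageMeta.csv"),
        ("J062945_user", "scientist"),
    ]
    conn.executemany("INSERT INTO AGENT (ID, Value) VALUES (?, ?)", rows)
    return conn


def csv_files_for(conn: sqlite3.Connection, id_pattern: str) -> list:
    """Return the Values of agents whose ID matches the given LIKE pattern."""
    cursor = conn.execute(
        "SELECT Value FROM AGENT WHERE ID LIKE ? ORDER BY Value", (id_pattern,)
    )
    return [row[0] for row in cursor]


def all_csv_files(conn: sqlite3.Connection) -> list:
    """Return every agent Value that ends in '.csv'."""
    cursor = conn.execute(
        "SELECT Value FROM AGENT WHERE Value LIKE ? ORDER BY Value", ("%.csv",)
    )
    return [row[0] for row in cursor]


if __name__ == "__main__":
    repository = build_demo_repository()
    print(csv_files_for(repository, "%Detection%"))
    print(all_csv_files(repository))
```

Passing the pattern as a bound parameter rather than concatenating it into the statement keeps the sketch safe to reuse with user-supplied patterns.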
Comment:
While implementing the mechanisms needed to import and export information using the OPM model, based on the given XML Schema (http://openprovenance.org), we ran into some difficulties while parsing the XML tags. Therefore, we modified the schema to cope with those problems as well as to match the database model of the repository. These changes to the original XML schema are summarized below:
account since the definition for AccountId and account ) and attribute (id ).
ProcessId, AgentId, or AgentId.
To clarify these points, we have attached the revised XML schema OPMv101.revised.xsd.
-- VictorGuevara - 02 Jun 2009
Attachment | Size | Date | Who | Comment |
---|---|---|---|---|
PC3-WSVLAM-V1.png | 76.0 K | 16 Apr 2009 - 09:27 | VictorGuevara | WS-VLAM Workflow |
PC3-WSVLAM-V2.png | 89.7 K | 16 Apr 2009 - 09:28 | VictorGuevara | WS-VLAM Workflow - detailed |
wsvlam-PC3-1.pps | 2505.5 K | 16 Apr 2009 - 12:46 | VictorGuevara | WS-VLAM Workflow Components |
PC3-Plier01.jpg | 169.9 K | 29 May 2009 - 15:30 | VictorGuevara | Screenshot of Plier Exchange |
PC3-Plier02.jpg | 202.5 K | 29 May 2009 - 15:47 | VictorGuevara | Screenshot of Plier Exchange (Diagram) |
PC3-model-wide.png | 156.2 K | 02 Jun 2009 - 13:24 | VictorGuevara | WS-VLAM (Detailed workflow) |
PC3-model-wide-OPM.png | 172.2 K | 02 Jun 2009 - 13:25 | VictorGuevara | PLIER GUI (Detailed workflow) |
PC3-model-wide-OPM.pdf | 142.9 K | 02 Jun 2009 - 13:29 | VictorGuevara | OPM Graph (Detailed workflow [PDF]) |
PC3-j062941.xml | 13.0 K | 02 Jun 2009 - 15:39 | VictorGuevara | Workflow j062941 |
PC3-j062942.xml | 13.0 K | 02 Jun 2009 - 15:40 | VictorGuevara | Workflow j062942 |
PC3-j062943.xml | 13.0 K | 02 Jun 2009 - 15:41 | VictorGuevara | Workflow j062943 |
PC3-j062944.xml | 13.0 K | 02 Jun 2009 - 15:41 | VictorGuevara | Workflow j062944 |
PC3-j062945.xml | 13.0 K | 02 Jun 2009 - 15:42 | VictorGuevara | Workflow j062945 |