PLIER: Provenance Layer Infrastructure for e-Science Resources
PLIER is being developed by the Information Management Group within the
Big Grid project.
PLIER provides an implementation of the Open Provenance Model (OPM) and hides the details of handling and managing provenance when executing workflows. Its API allows developers to build, store, and share (through XML serialization) OPM graphs.
Three main components constitute the Plier API:
- Implementing an optimized OPM relational database schema,
- Building the OPM objects and storing them into the database, using the Plier API,
- Manipulating and sharing OPM data and graphs, using the Plier Toolbox,
- Practical Steps: How to?
These components are described in more detail in the following sections:
1. Implementing an optimized OPM relational database schema: OPMRDB
PLIER provides an optimized OPM database schema for relational DBMSs. The schema can be created either through the PLIER API or with the predefined DDL script. The script that creates the schema for MySQL is available at: OpmDDL.
Figure 1 illustrates the Entity Relationship diagram for OPM as designed and implemented within PLIER.
A more detailed description of the database schema implementation, illustrated with graphs, is available at:
http://www.nikhef.nl/~ammarb/OPMRDB_1.1/
Figure 1. ER Diagram for OPM Relational Schema
2. Building the OPM objects and storing them into the database, using the Plier API
The Plier API is built on current standards and mechanisms for databases and Object-Relational Mapping (ORM):
- PLIER provides an optimized OPM-compliant database schema for relational DBMSs (OpmDDL),
- JDO 3.1 is used as a Java-centric API to access persistent data,
- Hibernate is used as a reference ORM implementation of the JDO API,
- A step-by-step example illustrating how to create and store OPM data, using Plier API: OpmDBTest
- A complete Java source code example, using Plier API: OpmDBTestSample (Victoria Sponge Cake Provenance).
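To make the idea of "building OPM objects" concrete, here is a minimal in-memory sketch that loosely follows the Victoria sponge cake scenario. It does not use the Plier API: the class `OpmGraphSketch`, its enums, and its methods are hypothetical names invented for this illustration, and nothing is persisted. What it does take from OPM itself is the three node kinds (artifact, process, agent) and the five dependency kinds, each pointing from an effect to its cause.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: these types are NOT part of the Plier API.
class OpmGraphSketch {
    enum NodeKind { ARTIFACT, PROCESS, AGENT }

    // The five OPM dependency kinds; each edge points from an effect to its cause.
    enum EdgeKind {
        USED(NodeKind.PROCESS, NodeKind.ARTIFACT),
        WAS_GENERATED_BY(NodeKind.ARTIFACT, NodeKind.PROCESS),
        WAS_CONTROLLED_BY(NodeKind.PROCESS, NodeKind.AGENT),
        WAS_TRIGGERED_BY(NodeKind.PROCESS, NodeKind.PROCESS),
        WAS_DERIVED_FROM(NodeKind.ARTIFACT, NodeKind.ARTIFACT);

        final NodeKind effect;
        final NodeKind cause;

        EdgeKind(NodeKind effect, NodeKind cause) {
            this.effect = effect;
            this.cause = cause;
        }
    }

    static final class Edge {
        final EdgeKind kind;
        final String effect;
        final String cause;

        Edge(EdgeKind kind, String effect, String cause) {
            this.kind = kind;
            this.effect = effect;
            this.cause = cause;
        }
    }

    private final Map<String, NodeKind> nodes = new LinkedHashMap<>();
    private final List<Edge> edges = new ArrayList<>();

    OpmGraphSketch node(String id, NodeKind kind) {
        nodes.put(id, kind);
        return this;
    }

    // Rejects edges whose endpoints are unknown or of the wrong kind for this dependency.
    OpmGraphSketch edge(EdgeKind kind, String effect, String cause) {
        if (nodes.get(effect) != kind.effect || nodes.get(cause) != kind.cause) {
            throw new IllegalArgumentException(kind + " cannot link " + effect + " to " + cause);
        }
        edges.add(new Edge(kind, effect, cause));
        return this;
    }

    // Ids of the direct causes of a node, in the order the edges were added.
    List<String> causesOf(String id) {
        List<String> result = new ArrayList<>();
        for (Edge e : edges) {
            if (e.effect.equals(id)) {
                result.add(e.cause);
            }
        }
        return result;
    }

    // A reduced version of the cake scenario: John bakes a cake from flour and eggs.
    static OpmGraphSketch cakeExample() {
        return new OpmGraphSketch()
            .node("flour", NodeKind.ARTIFACT)
            .node("eggs", NodeKind.ARTIFACT)
            .node("cake", NodeKind.ARTIFACT)
            .node("bake", NodeKind.PROCESS)
            .node("John", NodeKind.AGENT)
            .edge(EdgeKind.USED, "bake", "flour")
            .edge(EdgeKind.USED, "bake", "eggs")
            .edge(EdgeKind.WAS_GENERATED_BY, "cake", "bake")
            .edge(EdgeKind.WAS_CONTROLLED_BY, "bake", "John")
            .edge(EdgeKind.WAS_DERIVED_FROM, "cake", "flour");
    }

    public static void main(String[] args) {
        OpmGraphSketch g = cakeExample();
        System.out.println("bake depends on: " + g.causesOf("bake"));
        System.out.println("cake depends on: " + g.causesOf("cake"));
    }
}
```

In a real integration the equivalent objects would be created through the Plier classes shown in OpmDBTest and OpmDBTestSample, which also take care of storing them in the database.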
3. Using the Plier Toolbox to manipulate and share OPM data and graphs
Plier Toolbox is a Java-based interface through which end users can access information about the performed experiments and interpret the results. It uses the provenance repository as a back end, so that scientific experiments can be queried and searched in a user-friendly manner.
The Plier Toolbox combines database querying with a rich graphical representation of the experiments, which makes it easier to navigate and explore scientific workflows and opens the way to data sharing.
4. Practical Steps: How to?
This section briefly outlines the practical steps for deploying the Plier API:
- Create the provenance database in MySQL, using the OpmDDL script
- Download the latest version of Plier Core API
  - Update the configuration file 'hibernate.cfg.xml' so that it points to your own database, with the proper user name and password.
- Integrate the Plier API into your workflow system (or application)
  - See the step-by-step example (OpmDBTest) and the complete Java source code example (OpmDBTestSample) given above.
- Optionally, run the pre-compute scripts (SummaryScript) against the database to calculate summary data.
- Use the Plier Toolbox to browse through the provenance data.
  - Update the configuration file 'db.cfg.properties' so that it uses your own database, with the proper user name and password.
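The configuration updates in the steps above come down to pointing the tools at your own MySQL instance. As a rough sketch, the snippet below assembles the standard Hibernate connection properties that a 'hibernate.cfg.xml' normally carries. The class and method names, the host, port 3306, and the credentials are placeholders for illustration, not values taken from the Plier distribution; the keys used in 'db.cfg.properties' are not documented here and may well differ.

```java
import java.util.Properties;

// Illustrative sketch only: shows which standard Hibernate settings usually need editing.
class HibernateSettingsSketch {

    // Builds connection settings for a MySQL database; all arguments are placeholders.
    static Properties connectionSettings(String host, String database,
                                         String user, String password) {
        Properties p = new Properties();
        // Driver class of MySQL Connector/J 5.x; newer releases use com.mysql.cj.jdbc.Driver.
        p.setProperty("hibernate.connection.driver_class", "com.mysql.jdbc.Driver");
        // Assumes the default MySQL port 3306.
        p.setProperty("hibernate.connection.url", "jdbc:mysql://" + host + ":3306/" + database);
        p.setProperty("hibernate.connection.username", user);
        p.setProperty("hibernate.connection.password", password);
        p.setProperty("hibernate.dialect", "org.hibernate.dialect.MySQLDialect");
        return p;
    }

    public static void main(String[] args) {
        // Hypothetical values; replace them with those of your own database.
        Properties p = connectionSettings("localhost", "opm", "plier_user", "change-me");
        for (String key : p.stringPropertyNames()) {
            // Avoid echoing the password to the console.
            String value = key.endsWith("password") ? "****" : p.getProperty(key);
            System.out.println(key + " = " + value);
        }
    }
}
```

In 'hibernate.cfg.xml' each of these keys would appear as a property element inside the session-factory section; compare them with the file shipped with Plier before changing anything.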
-- AmmarBenabdelkader - 07 Jan 2011