The WfCommons Project
The WfCommons project is a community framework for enabling scientific workflow research and development. It provides foundational tools for analyzing workflow execution instances and for generating synthetic, yet realistic, workflow instances. These instances can be used to develop new techniques, algorithms, and systems that overcome the challenges of efficiently and robustly executing ever larger workflows on increasingly complex distributed infrastructures. The figure below shows an overview of the workflow research life cycle, which integrates the three axes of the WfCommons project:
The first axis (Workflow Instances) of the WfCommons project targets the collection and curation of open access production workflow executions from various scientific applications, shared in a common instance format (i.e., the WfCommons JSON Format). We keep a list of workflow execution instances on our project website.
The second axis (Workflow Generator) of the WfCommons project targets the generation of realistic synthetic workflow instances based on workflow execution profiles extracted from execution instances. We are constantly seeking additional workflow execution instances to refine existing workflow recipes or develop new ones for the WfCommons workflow generator.
The third axis (Workflow Simulator) of the WfCommons project fosters the use of simulation for the development, evaluation, and verification of scheduling and resource provisioning algorithms (e.g., multi-objective function optimization) and for the evaluation of current and emerging computing platforms (e.g., clouds, IoT, extreme scale), among others. We keep a list of open source workflow management system simulators and simulation frameworks that support the WfCommons JSON format on our project website.
This Python package provides a collection of tools for:
Analyzing instances of actual workflow executions;
Producing recipe structures for creating workflow recipes for workflow generation; and
Generating realistic synthetic workflow instances.
The WfCommons JSON Format
The WfCommons project uses a common format for representing workflow execution
instances and generated synthetic workflow instances, so that workflow simulators and
simulation frameworks (that provide support for the WfCommons format) can use
such instances interchangeably. This common format uses a JSON specification
available in the
WfCommons JSON schema GitHub
repository. The current version of the WfCommons Python package uses schema
version 1.1. The schema GitHub repository provides a detailed explanation
of the WfCommons JSON format (including required fields), and also a validator
script for verifying the compatibility of instances.
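
As a rough illustration of what a compatibility check involves, the sketch below parses an instance and reports missing top-level keys or an unexpected schema version. It uses only the Python standard library and is not the validator script from the schema repository. The key names in `ASSUMED_REQUIRED_KEYS` and the helper functions are hypothetical, chosen for illustration; the authoritative list of required fields is the one in the WfCommons JSON schema repository.

```python
import json

# Hypothetical top-level keys, for illustration only. The authoritative list
# of required fields is defined in the WfCommons JSON schema repository.
ASSUMED_REQUIRED_KEYS = ("name", "schemaVersion", "workflow")


def missing_keys(instance, required=ASSUMED_REQUIRED_KEYS):
    """Return the required keys that are absent from a parsed instance."""
    return [key for key in required if key not in instance]


def check_instance_text(text, expected_version="1.1"):
    """Parse JSON text and return a list of human-readable problems."""
    instance = json.loads(text)
    problems = [f"missing key: {key}" for key in missing_keys(instance)]
    version = instance.get("schemaVersion")
    if version is not None and version != expected_version:
        problems.append(
            f"schema version {version!r}, expected {expected_version!r}"
        )
    return problems


# A tiny made-up instance that satisfies the assumed keys.
sample = json.dumps({"name": "example", "schemaVersion": "1.1", "workflow": {}})
print(check_instance_text(sample))
```

An empty list means no problems were found under these assumed rules. For real instances, the validator script in the schema repository remains the reference check.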