wfcommons.common

wfcommons.common.file

class wfcommons.common.file.File(name: str, size: int, link: FileLink, logger: Logger | None = None)

Bases: object

Representation of a file.

Parameters:
  • name (str) – The name of the file.

  • size (int) – File size in bytes.

  • link (FileLink) – Type of file link.

  • logger (Optional[Logger]) – The logger used to record information, warnings, or errors.

as_dict() → Dict[str, str | int | FileLink]

A JSON representation of the file.

Returns:

A JSON object representation of the file.

Return type:

Dict[str, Union[str, int, FileLink]]

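class wfcommons.common.file.FileLink(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: NoValue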

Type of file link.

INPUT = 'input'
OUTPUT = 'output'
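
The return type of as_dict() allows FileLink members among the values, and json.dumps cannot encode a plain Enum member by default. The following is a minimal sketch of one way to handle that. It uses a stand-in enum with the documented values rather than wfcommons itself; the dictionary keys and the encode_enum helper are hypothetical, and the real as_dict() may already convert the link to its string value.

```python
import json
from enum import Enum


class FileLink(Enum):
    """Stand-in mirroring the documented values; not the wfcommons class."""
    INPUT = 'input'
    OUTPUT = 'output'


def encode_enum(obj):
    """Fallback for json.dumps: replace an Enum member with its value."""
    if isinstance(obj, Enum):
        return obj.value
    raise TypeError(f'Cannot serialize {type(obj).__name__}')


# Hypothetical dictionary shaped like the documented return type
# Dict[str, Union[str, int, FileLink]]; the key names are assumptions.
file_dict = {'name': 'data.txt', 'size': 1024, 'link': FileLink.INPUT}

# Without default=encode_enum this call raises TypeError for the enum value.
encoded = json.dumps(file_dict, default=encode_enum)
```

Passing a default callable keeps the dictionary itself unchanged, so the enum member remains available to Python code until the moment of serialization.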

wfcommons.common.task

class wfcommons.common.task.Task(name: str, task_type: TaskType, runtime: float, cores: float = 1.0, task_id: str | None = None, category: str | None = None, machine: Machine | None = None, program: str | None = None, args: List[str] | None = None, avg_cpu: float | None = None, bytes_read: int | None = None, bytes_written: int | None = None, memory: int | None = None, energy: int | None = None, avg_power: float | None = None, priority: int | None = None, files: List[File] | None = None, logger: Logger | None = None, launch_dir: str | None = None, start_time: str | None = None)

Bases: object

Representation of a task.

Parameters:
  • name (str) – The name of the task.

  • task_type (TaskType) – The type of the task.

  • runtime (float) – Task runtime in seconds.

  • cores (float) – Number of cores required by the task.

  • task_id (Optional[str]) – Task unique ID (e.g., ID0000001).

  • category (Optional[str]) – Task category (can be used, for example, to define tasks that use the same program).

  • machine (Optional[Machine]) – Machine on which the task has been executed.

  • program (Optional[str]) – Program name.

  • args (Optional[List[str]]) – List of task arguments.

  • avg_cpu (Optional[float]) – Average CPU utilization in %.

  • bytes_read (Optional[int]) – Total amount of data read, in KB.

  • bytes_written (Optional[int]) – Total amount of data written, in KB.

  • memory (Optional[int]) – Memory (resident set) size of the process in bytes.

  • energy (Optional[int]) – Total energy consumption in kWh.

  • avg_power (Optional[float]) – Average power consumption in W.

  • priority (Optional[int]) – Task priority.

  • files (Optional[List[File]]) – List of input/output files used by the task.

  • logger (Optional[Logger]) – The logger used to record information, warnings, or errors.

as_dict() → Dict

A JSON representation of the task.

Returns:

A JSON object representation of the task.

Return type:

Dict

class wfcommons.common.task.TaskType(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: NoValue

Task type.

AUXILIARY = 'auxiliary'
COMPUTE = 'compute'
SUBWORKFLOW = 'subworkflow'
TRANSFER = 'transfer'
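
Because each member carries a string value, a task type read from text (for example, from a JSON instance) can be mapped back to a member by value. The sketch below assumes NoValue behaves like a standard enum.Enum subclass and uses a stand-in class rather than wfcommons itself; parse_task_type is a hypothetical helper, not part of the library.

```python
from enum import Enum


class TaskType(Enum):
    """Stand-in with the documented members; not the wfcommons class."""
    AUXILIARY = 'auxiliary'
    COMPUTE = 'compute'
    SUBWORKFLOW = 'subworkflow'
    TRANSFER = 'transfer'


def parse_task_type(text: str) -> TaskType:
    """Hypothetical helper: look up a member by value, ignoring case.

    Raises ValueError when the text matches no documented value.
    """
    return TaskType(text.strip().lower())


kind = parse_task_type('Compute')
```

Lookup by value is standard Enum behaviour, so an unknown string fails loudly with ValueError instead of silently producing an invalid task type.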

wfcommons.common.machine

class wfcommons.common.machine.Machine(name: str, cpu: Dict[str, int | str], system: MachineSystem | None = None, architecture: str | None = None, memory: int | None = None, release: str | None = None, hashcode: str | None = None, logger: Logger | None = None)

Bases: object

Representation of one compute machine.

Parameters:
  • name (str) – Machine node name.

  • cpu (Dict[str, Union[int, str]]) –

    A dictionary containing information about the CPU specification. It must contain at least two fields: count (number of CPU cores) and speed (CPU speed of each core in MHz).

    cpu = {
        'count': 48,
        'speed': 1200
    }
    

  • system (Optional[MachineSystem]) – Machine system (linux, macos, windows).

  • architecture (Optional[str]) – Machine architecture (e.g., x86_64, ppc).

  • memory (Optional[int]) – Total machine RAM in bytes.

  • release (Optional[str]) – Machine release.

  • hashcode (Optional[str]) – MD5 hash code for the machine.

  • logger (Optional[Logger]) – The logger used to record information, warnings, or errors.
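
The cpu parameter is described as requiring the count and speed fields. A minimal sketch of checking a dictionary for them before building a Machine follows; missing_cpu_fields and REQUIRED_CPU_FIELDS are hypothetical names, and whether the library performs a similar check itself is not stated here.

```python
from typing import Dict, List, Union

# The two fields the cpu parameter is documented to require.
REQUIRED_CPU_FIELDS = ('count', 'speed')


def missing_cpu_fields(cpu: Dict[str, Union[int, str]]) -> List[str]:
    """Hypothetical helper: return the required fields absent from cpu."""
    return [field for field in REQUIRED_CPU_FIELDS if field not in cpu]


cpu = {
    'count': 48,    # number of CPU cores
    'speed': 1200,  # speed of each core in MHz
}

missing = missing_cpu_fields(cpu)
```

An empty list means the dictionary carries both documented fields; extra keys are left alone, since the description only sets a minimum.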

as_dict() → Dict[str, int | str]

A JSON representation of the machine.

Returns:

A JSON object representation of the machine.

Return type:

Dict[str, Union[int, str]]

class wfcommons.common.machine.MachineSystem(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: NoValue

Machine system type.

LINUX = 'linux'
MACOS = 'macos'
WINDOWS = 'windows'

wfcommons.common.workflow

class wfcommons.common.workflow.Workflow(name: str | None = 'workflow', description: str | None = None, wms_name: str | None = None, wms_version: str | None = None, wms_url: str | None = None, executed_at: str | None = None, makespan: int | None = 0.0)

Bases: DiGraph

Representation of a workflow. The workflow representation is an extension of the NetworkX DiGraph class.

Parameters:
  • name (Optional[str]) – Workflow name.

  • description (Optional[str]) – Workflow instance description.

  • wms_name (Optional[str]) – WMS name.

  • wms_version (Optional[str]) – WMS version.

  • wms_url (Optional[str]) – URL for the WMS website.

  • executed_at (Optional[str]) – Workflow start timestamp in the ISO 8601 format.

  • makespan (Optional[int]) – Workflow makespan in seconds.
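
Since executed_at is a string in ISO 8601 format, one way to produce a suitable value is with the standard library's datetime module. This is only a sketch: the date is made up for illustration, and the section does not say which ISO 8601 variant (with or without a UTC offset) is expected.

```python
from datetime import datetime, timezone

# Illustrative start time; any timezone-aware datetime works the same way.
started = datetime(2024, 1, 15, 9, 30, 0, tzinfo=timezone.utc)

# isoformat() renders an ISO 8601 string, including the UTC offset.
executed_at = started.isoformat()  # '2024-01-15T09:30:00+00:00'
```

The resulting string could then be passed as the executed_at argument when constructing a Workflow.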

add_dependency(parent: str, child: str) → None

Add a dependency between tasks.

Parameters:
  • parent (str) – Parent task name.

  • child (str) – Child task name.

add_task(task: Task) → None

Add a Task to the workflow.

Parameters:

task (Task) – A Task object.

leaves() → List[Task]

read_dot(dot_file_path: Path | None = None) → None

Read a dot file of the workflow instance.

Parameters:

dot_file_path (Optional[pathlib.Path]) – DOT input file name.

roots() → List[Task]

to_nx_digraph() → DiGraph

write_dot(dot_file_path: Path | None = None) → None

Write a dot file of the workflow instance.

Parameters:

dot_file_path (Optional[pathlib.Path]) – DOT output file name.

write_json(json_file_path: Path | None = None) → None

Write a JSON file of the workflow instance.

Parameters:

json_file_path (Optional[pathlib.Path]) – JSON output file name.
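
The roots() and leaves() methods are listed without a description. The plain-Python sketch below illustrates the usual graph meaning of those terms, assuming roots are tasks with no parents and leaves are tasks with no children. DependencyGraph is a hypothetical stand-in that tracks task names only; it is not the wfcommons Workflow, which extends the NetworkX DiGraph and returns Task objects.

```python
from typing import Dict, List, Set


class DependencyGraph:
    """Plain-Python stand-in for a task graph; not the wfcommons Workflow."""

    def __init__(self) -> None:
        self.children: Dict[str, Set[str]] = {}
        self.parents: Dict[str, Set[str]] = {}

    def add_task(self, name: str) -> None:
        """Register a task name without adding any edges."""
        self.children.setdefault(name, set())
        self.parents.setdefault(name, set())

    def add_dependency(self, parent: str, child: str) -> None:
        """Record that child depends on parent."""
        self.add_task(parent)
        self.add_task(child)
        self.children[parent].add(child)
        self.parents[child].add(parent)

    def roots(self) -> List[str]:
        """Tasks with no parents (assumed meaning of roots)."""
        return sorted(t for t, p in self.parents.items() if not p)

    def leaves(self) -> List[str]:
        """Tasks with no children (assumed meaning of leaves)."""
        return sorted(t for t, c in self.children.items() if not c)


# A small diamond-shaped workflow with made-up task names.
graph = DependencyGraph()
graph.add_dependency('split', 'align_1')
graph.add_dependency('split', 'align_2')
graph.add_dependency('align_1', 'merge')
graph.add_dependency('align_2', 'merge')

entry_tasks = graph.roots()
exit_tasks = graph.leaves()
```

In this diamond, split is the only entry task and merge the only exit task, which is the kind of information a scheduler or simulator typically needs first.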