Overview
The WeData data development module supports two scheduling modes: task scheduling and workflow scheduling. This document helps you understand the core differences between the two modes, enabling you to make an appropriate choice when you create a project.
Note:
A project supports only one scheduling mode. Once configured, the mode cannot be modified. Choose carefully.
Tasks in a task scheduling project and tasks in a workflow scheduling project cannot depend on each other. In other words, dependencies cannot be established between projects that use different scheduling modes.
Essential Differences
Different Scheduling Granularities
Workflow scheduling: It operates at the granularity of the workflow as a whole. Scheduling configurations are unified at the workflow level, and all tasks within a workflow are scheduled and run as a single unit.
Task scheduling: It operates at the granularity of individual tasks. Each task is configured with its own scheduling policy and can have a different scheduling cycle.
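The difference in granularity can be sketched as a small data model. This is an illustrative sketch only and not the WeData API: the `Task` and `Workflow` classes, the `cron` fields, and the `effective_schedules` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Task:
    name: str
    # Hypothetical per-task schedule; only consulted in task scheduling mode.
    cron: Optional[str] = None


@dataclass
class Workflow:
    name: str
    tasks: List[Task] = field(default_factory=list)
    # Hypothetical workflow-level schedule; only consulted in workflow scheduling mode.
    cron: Optional[str] = None


def effective_schedules(workflow: Workflow, mode: str) -> Dict[str, Optional[str]]:
    """Return the schedule each task actually runs on under the given mode."""
    if mode == "workflow":
        # One configuration at the workflow level applies to every task.
        return {t.name: workflow.cron for t in workflow.tasks}
    if mode == "task":
        # Each task carries its own scheduling policy.
        return {t.name: t.cron for t in workflow.tasks}
    raise ValueError(f"unknown scheduling mode: {mode!r}")


wf = Workflow(
    "etl",
    tasks=[Task("extract", cron="0 1 * * *"), Task("load", cron="0 * * * *")],
    cron="0 2 * * *",
)
print(effective_schedules(wf, "task"))
print(effective_schedules(wf, "workflow"))
```

In the sketch, task mode yields a different cycle for each task, while workflow mode yields the same workflow-level cycle for all of them.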
Different Instance Uniqueness
Task scheduling: For a given task, the instance at a specific scheduled time is unique. Periodic scheduling, backfilling, and reruns share the same instance space, so these three operations are associated with one another rather than independent.
Workflow scheduling: At the same scheduled time, multiple run instances may exist. Periodic runs, manual runs, and reruns are independent of each other and do not interfere with one another.
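One way to picture the two instance models is as two different keyed stores. The sketch below is a simplified illustration under stated assumptions, not WeData's internal implementation: the class names, the `operation` labels, and the dictionary layout are all hypothetical.

```python
from collections import defaultdict


class TaskInstanceStore:
    """Task scheduling: at most one instance per (task, scheduled_time)."""

    def __init__(self):
        self._instances = {}

    def record(self, task, scheduled_time, operation):
        key = (task, scheduled_time)
        # Periodic runs, backfills, and reruns all resolve to the same key,
        # so they act on one shared instance.
        instance = self._instances.setdefault(key, {"operations": []})
        instance["operations"].append(operation)
        return instance

    def count(self, task, scheduled_time):
        return 1 if (task, scheduled_time) in self._instances else 0


class WorkflowRunStore:
    """Workflow scheduling: any number of runs per (workflow, scheduled_time)."""

    def __init__(self):
        self._runs = defaultdict(list)

    def record(self, workflow, scheduled_time, operation):
        runs = self._runs[(workflow, scheduled_time)]
        # Each periodic run, manual run, or rerun gets its own independent record.
        run = {"run_id": len(runs) + 1, "operation": operation}
        runs.append(run)
        return run

    def count(self, workflow, scheduled_time):
        return len(self._runs.get((workflow, scheduled_time), []))


when = "2024-01-01T00:00"
tasks = TaskInstanceStore()
tasks.record("load", when, "periodic")
tasks.record("load", when, "backfill")
print(tasks.count("load", when))  # 1

runs = WorkflowRunStore()
runs.record("etl", when, "periodic")
runs.record("etl", when, "manual")
print(runs.count("etl", when))  # 2
```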
Feature Differences
Feature | Task Scheduling | Workflow Scheduling |
Cross-workflow Dependency | Supported | Not supported |
Cross-project Dependency | Supported | Not supported |
Nested Workflow | Not supported | Supported (as an alternative to cross-workflow dependency). The nested workflow runs by following the scheduling configuration of the external workflow. |
Backfill | Supported. Backfill is associated with periodic scheduling instances. | Not supported. Use "Run" (manual trigger) as an alternative, which generates an independent run instance. |
Scheduling Configuration Granularity | Task granularity. Each task can independently set its scheduling cycle and dependency policy. | Workflow-level granularity. Supports scheduled triggering and file arrival triggering. If scheduling is not configured, it defaults to manual triggering. |
Standard Mode | Supported | Not supported |
Cross-project Clone | Supported | Not supported |
Import/Export | Supported | Not supported |
CI/CD | Supported | Supported |
O&M Dashboard | Supported | Not supported |
Baseline Monitoring | Supported | Not supported |
Ops Granularity | Task-instance-centric | Centered on workflow run records, with the ability to drill down to the task level |
Pause/Decommission/Start | Supported | Not supported |
Alarm Object | Task-level alarms | Workflow-level and task-level alarms |
Scheduling Mode Selection Recommendations
Select task scheduling if:
Scheduling cycles vary significantly between tasks.
You need dependencies across workflows or projects.
Select workflow scheduling if:
A group of tasks shares the same scheduling cycle and must be scheduled as a single unit.
You want to manage scheduling relationships between multiple workflows by using nested workflows.
You want to simplify scheduling configuration.
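The recommendations above can be condensed into a small decision helper. This is a deliberate simplification for illustration: the function name and its two boolean parameters are hypothetical, and a real decision should also weigh the feature differences listed earlier.

```python
def recommend_mode(cycles_vary: bool, needs_cross_dependencies: bool) -> str:
    """Return "task" or "workflow" based on the two main selection criteria."""
    # Task scheduling fits when tasks need their own cycles, or when
    # dependencies must cross workflow or project boundaries.
    if cycles_vary or needs_cross_dependencies:
        return "task"
    # Otherwise the tasks can be scheduled together as one unit.
    return "workflow"


print(recommend_mode(cycles_vary=False, needs_cross_dependencies=True))   # task
print(recommend_mode(cycles_vary=False, needs_cross_dependencies=False))  # workflow
```

Because the mode cannot be changed after the project is created, it is worth evaluating these criteria against planned workloads as well as current ones.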
Terms
Cross-workflow dependency: In task scheduling mode, a dependency established between tasks that belong to two different workflows.
Cross-project dependency: In task scheduling mode, a dependency established between tasks that belong to two different projects.
Nested workflow: In workflow scheduling mode, a nested workflow is a task type. You create a nested workflow task within workflow A and select workflow B as the workflow it nests. When workflow A runs, it triggers the nested workflow task, which in turn runs workflow B. Workflow B can also be configured and scheduled to run independently.
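The nested workflow behavior can be sketched as a recursive run over a workflow's tasks. This is a minimal model, assuming tasks simply run in list order; the `Workflow`, `Task`, and `run` names are hypothetical, and real dependency ordering and cycle detection are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Workflow:
    name: str
    tasks: List["Task"] = field(default_factory=list)


@dataclass
class Task:
    name: str
    # When set, this task is a nested workflow task that runs another workflow.
    nested: Optional[Workflow] = None


def run(workflow: Workflow, log: Optional[List[str]] = None) -> List[str]:
    """Run tasks in list order and return the execution log."""
    if log is None:
        log = []
    for task in workflow.tasks:
        log.append(f"{workflow.name}.{task.name}")
        if task.nested is not None:
            # Running the nested workflow task triggers the selected workflow.
            run(task.nested, log)
    return log


b = Workflow("B", tasks=[Task("b1"), Task("b2")])
a = Workflow("A", tasks=[Task("a1"), Task("call_b", nested=b), Task("a2")])

print(run(a))  # ['A.a1', 'A.call_b', 'B.b1', 'B.b2', 'A.a2']
print(run(b))  # ['B.b1', 'B.b2']
```

The second call reflects that workflow B remains a normal workflow and can still run on its own.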