|
@@ -18,129 +18,202 @@
|
|
|
Tutorial
|
|
|
========
|
|
|
|
|
|
-This tutorial show you the basic concept of *PyDolphinScheduler* and tell all
|
|
|
+This tutorial shows you the basic concepts of *PyDolphinScheduler* and tells you all the
|
|
|
things you should know before you submit or run your first workflow. If you
|
|
|
-still not install *PyDolphinScheduler* and start Apache DolphinScheduler, you
|
|
|
-could go and see :ref:`how to getting start PyDolphinScheduler <start:getting started>`
|
|
|
+have not yet installed *PyDolphinScheduler* and started DolphinScheduler, you
|
|
|
+should first see :ref:`how to get started with PyDolphinScheduler <start:getting started>`.
|
|
|
|
|
|
Overview of Tutorial
|
|
|
--------------------
|
|
|
|
|
|
-Here have an overview of our tutorial, and it look a little complex but do not
|
|
|
-worry about that because we explain this example below as detailed as possible.
|
|
|
+Here is an overview of our tutorial. It looks a little complex, but do not
|
|
|
+worry about that because we explain this example below in as much detail as possible.
|
|
|
|
|
|
-.. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
- :start-after: [start tutorial]
|
|
|
- :end-before: [end tutorial]
|
|
|
+There are two types of tutorials: traditional and task decorator.
|
|
|
+
|
|
|
+- **Traditional Way**: More general and supports many :doc:`built-in task types <tasks/index>`. It is convenient
|
|
|
+  when you are starting to build your workflow.
|
|
|
+- **Task Decorator**: A Python decorator that allows you to wrap your function into a pydolphinscheduler task. It is
|
|
|
+  less versatile than the traditional way because it only supports Python functions and does not support
|
|
|
+  built-in tasks. But it is helpful if your workflow is built entirely with Python or if you already have some Python
|
|
|
+  workflow code and want to migrate it to pydolphinscheduler.
|
|
|
+
|
|
|
+.. tab:: Tradition
|
|
|
+
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start tutorial]
|
|
|
+ :end-before: [end tutorial]
|
|
|
+
|
|
|
+.. tab:: Task Decorator
|
|
|
+
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial_decorator.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start tutorial]
|
|
|
+ :end-before: [end tutorial]
|
|
|
|
|
|
Import Necessary Module
|
|
|
-----------------------
|
|
|
|
|
|
-First of all, we should importing necessary module which we would use later just
|
|
|
-like other Python package. We just create a minimum demo here, so we just import
|
|
|
-:class:`pydolphinscheduler.core.process_definition` and
|
|
|
-:class:`pydolphinscheduler.tasks.shell`.
|
|
|
+First of all, we should import the necessary modules that we will use later, just like any other Python package.
|
|
|
|
|
|
-.. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
- :start-after: [start package_import]
|
|
|
- :end-before: [end package_import]
|
|
|
+.. tab:: Tradition
|
|
|
+
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start package_import]
|
|
|
+ :end-before: [end package_import]
|
|
|
|
|
|
-If you want to use other task type you could click and
|
|
|
-:doc:`see all tasks we support <tasks/index>`
|
|
|
+   In the traditional tutorial we import :class:`pydolphinscheduler.core.process_definition.ProcessDefinition` and
|
|
|
+ :class:`pydolphinscheduler.tasks.shell.Shell`.
|
|
|
+
|
|
|
+   If you want to use other task types, you can :doc:`see all tasks we support <tasks/index>`.
|
|
|
+
|
|
|
+.. tab:: Task Decorator
|
|
|
+
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial_decorator.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start package_import]
|
|
|
+ :end-before: [end package_import]
|
|
|
+
|
|
|
+   In the task decorator tutorial we import :class:`pydolphinscheduler.core.process_definition.ProcessDefinition` and
|
|
|
+ :func:`pydolphinscheduler.tasks.func_wrap.task`.
|
|
|
|
|
|
Process Definition Declaration
|
|
|
------------------------------
|
|
|
|
|
|
-We should instantiate object after we import them from `import necessary module`_.
|
|
|
-Here we declare basic arguments for process definition(aka, workflow). We define
|
|
|
-the name of process definition, using `Python context manager`_ and it
|
|
|
-**the only required argument** for object process definition. Beside that we also
|
|
|
-declare three arguments named `schedule`, `start_time` which setting workflow schedule
|
|
|
-interval and schedule start_time, and argument `tenant` which changing workflow's
|
|
|
-task running user in the worker, :ref:`section tenant <concept:tenant>` in *PyDolphinScheduler*
|
|
|
-:doc:`concept` page have more detail information.
|
|
|
+We should instantiate a :class:`pydolphinscheduler.core.process_definition.ProcessDefinition` object after we
|
|
|
+import it in `import necessary module`_. Here we declare the basic arguments for the process definition (aka workflow).
|
|
|
+We define the name of the :code:`ProcessDefinition` using a `Python context manager`_; the name is **the only required argument**
|
|
|
+for `ProcessDefinition`. Besides that, we also declare three arguments: :code:`schedule` and :code:`start_time`,
|
|
|
+which set the workflow schedule interval and schedule start time, and :code:`tenant`, which defines which tenant
|
|
|
+will run the tasks in the DolphinScheduler worker. See :ref:`section tenant <concept:tenant>` in the
|
|
|
+*PyDolphinScheduler* :doc:`concept` page for more information.
|
|
|
|
|
|
-.. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
- :start-after: [start workflow_declare]
|
|
|
- :end-before: [end workflow_declare]
|
|
|
+.. tab:: Tradition
|
|
|
+
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start workflow_declare]
|
|
|
+ :end-before: [end workflow_declare]
|
|
|
+
|
|
|
+.. tab:: Task Decorator
|
|
|
|
|
|
-We could find more detail about process definition in
|
|
|
-:ref:`concept about process definition <concept:process definition>` if you interested in it.
|
|
|
-For all arguments of object process definition, you could find in the
|
|
|
-:class:`pydolphinscheduler.core.process_definition` api documentation.
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial_decorator.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start workflow_declare]
|
|
|
+ :end-before: [end workflow_declare]
|
|
|
+
|
|
|
+You can find more detail about :code:`ProcessDefinition` in :ref:`concept about process definition <concept:process definition>`
|
|
|
+if you are interested in it. For all arguments of the process definition object, see the
|
|
|
+:class:`pydolphinscheduler.core.process_definition` API documentation.
|
|
|
|
|
|
Task Declaration
|
|
|
----------------
|
|
|
|
|
|
-Here we declare four tasks, and bot of them are simple task of
|
|
|
-:class:`pydolphinscheduler.tasks.shell` which running `echo` command in terminal.
|
|
|
-Beside the argument `command`, we also need setting argument `name` for each task *(not
|
|
|
-only shell task, `name` is required for each type of task)*.
|
|
|
+.. tab:: Tradition
|
|
|
|
|
|
-.. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
- :dedent: 0
|
|
|
- :start-after: [start task_declare]
|
|
|
- :end-before: [end task_declare]
|
|
|
+   We declare four tasks to show how to create tasks, and all of them are simple tasks of
|
|
|
+   :class:`pydolphinscheduler.tasks.shell` which run the `echo` command in the terminal. Besides the argument
|
|
|
+   `command`, which holds the :code:`echo` command, we also need to set the argument `name` for each task
|
|
|
+   *(not only for shell tasks; `name` is required for every type of task)*.
|
|
|
+
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start task_declare]
|
|
|
+ :end-before: [end task_declare]
|
|
|
+
|
|
|
+   Besides the shell task, *PyDolphinScheduler* supports many other task types, which you can find in :doc:`tasks/index`.
|
|
|
+
|
|
|
+.. tab:: Task Decorator
|
|
|
|
|
|
-Beside shell task, *PyDolphinScheduler* support multiple tasks and you could
|
|
|
-find in :doc:`tasks/index`.
|
|
|
+   We declare four tasks to show how to create tasks, and all of them are created by the task decorator,
|
|
|
+   :func:`pydolphinscheduler.tasks.func_wrap.task`. All we have to do is add a decorator named
|
|
|
+   :code:`@task` to existing Python functions, and then use them inside :class:`pydolphinscheduler.core.process_definition`.
|
|
|
+
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial_decorator.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start task_declare]
|
|
|
+ :end-before: [end task_declare]
|
|
|
+
|
|
|
+   It makes our workflow more Pythonic, but be aware that in task decorator mode we can only use
|
|
|
+   Python functions as tasks and cannot use the :doc:`built-in tasks <tasks/index>` in most cases.
|
|
|
|
|
|
Setting Task Dependence
|
|
|
-----------------------
|
|
|
|
|
|
-After we declare both process definition and task, we have one workflow with
|
|
|
-four tasks, both all tasks is independent so that they would run in parallel.
|
|
|
-We should reorder the sort and the dependence of tasks. It useful when we need
|
|
|
-run prepare task before we run actual task or we need tasks running is specific
|
|
|
-rule. We both support attribute `set_downstream` and `set_upstream`, or bitwise
|
|
|
-operators `>>` and `<<`.
|
|
|
+After we declare both the process definition and the tasks, we have four tasks that are independent and will run
|
|
|
+in parallel. If you want one task to start only after another task has finished, you have to set dependence between those
|
|
|
+tasks.
|
|
|
|
|
|
-In this example, we set task `task_parent` is the upstream task of task
|
|
|
-`task_child_one` and `task_child_two`, and task `task_union` is the downstream
|
|
|
-task of both these two task.
|
|
|
+Setting task dependence is quite easy with the task's attributes :code:`set_downstream` and :code:`set_upstream`, or with the
|
|
|
+bitwise operators :code:`>>` and :code:`<<`.
|
|
|
|
|
|
-.. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
- :dedent: 0
|
|
|
- :start-after: [start task_relation_declare]
|
|
|
- :end-before: [end task_relation_declare]
|
|
|
+In this tutorial, task `task_parent` is the leading task of the whole workflow, and task `task_child_one` and
|
|
|
+task `task_child_two` are its downstream tasks. Task `task_union` will not run until both task `task_child_one`
|
|
|
+and task `task_child_two` are done, because both tasks are upstream of `task_union`.
|
|
|
+
|
|
|
+.. tab:: Tradition
|
|
|
+
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start task_relation_declare]
|
|
|
+ :end-before: [end task_relation_declare]
|
|
|
|
|
|
-Please notice that we could grouping some tasks and set dependence if they have
|
|
|
-same downstream or upstream. We declare task `task_child_one` and `task_child_two`
|
|
|
-as a group here, named as `task_group` and set task `task_parent` as upstream of
|
|
|
-both of them. You could see more detail in :ref:`concept:Tasks Dependence` section in concept
|
|
|
-documentation.
|
|
|
+.. tab:: Task Decorator
|
|
|
+
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial_decorator.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start task_relation_declare]
|
|
|
+ :end-before: [end task_relation_declare]
|
|
|
+
|
|
|
+.. note::
|
|
|
+
|
|
|
+   We can set task dependence in batch mode for tasks that have the same downstream or upstream by declaring those
|
|
|
+   tasks as a task group. In the tutorial, we declare task `task_child_one` and `task_child_two` as a task group named
|
|
|
+   `task_group`, then set `task_group` as downstream of task `task_parent`. See
|
|
|
+   :ref:`concept:Tasks Dependence` for more detail about how to set task dependence.
|
|
|
|
|
|
Submit Or Run Workflow
|
|
|
----------------------
|
|
|
|
|
|
-Now we finish our workflow definition, with task and task dependence, but all
|
|
|
-these things are in local, we should let Apache DolphinScheduler daemon know what we
|
|
|
-define our workflow. So the last thing we have to do here is submit our workflow to
|
|
|
-Apache DolphinScheduler daemon.
|
|
|
+After that, our workflow definition is finished, with four tasks and their dependence, but all these things are
|
|
|
+still local. We should let the DolphinScheduler daemon know about the workflow definition, so the last thing we
|
|
|
+have to do is submit the workflow to the DolphinScheduler daemon.
|
|
|
|
|
|
-We here in the example using `ProcessDefinition` attribute `run` to submit workflow
|
|
|
-to the daemon, and set the schedule time we just declare in `process definition declaration`_.
|
|
|
+Fortunately, we have a convenient method to submit the workflow via the `ProcessDefinition` attribute :code:`run`, which
|
|
|
+creates the workflow definition as well as the workflow schedule.
|
|
|
|
|
|
-Now, we could run the Python code like other Python script, for the basic usage run
|
|
|
-:code:`python tutorial.py` to trigger and run it.
|
|
|
+.. tab:: Tradition
|
|
|
+
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start submit_or_run]
|
|
|
+ :end-before: [end submit_or_run]
|
|
|
|
|
|
-.. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
- :dedent: 0
|
|
|
- :start-after: [start submit_or_run]
|
|
|
- :end-before: [end submit_or_run]
|
|
|
+.. tab:: Task Decorator
|
|
|
+
|
|
|
+ .. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial_decorator.py
|
|
|
+ :dedent: 0
|
|
|
+ :start-after: [start submit_or_run]
|
|
|
+ :end-before: [end submit_or_run]
|
|
|
+
|
|
|
+Finally, you can execute this workflow code in your terminal like any other Python script, running
|
|
|
+:code:`python tutorial.py` to trigger and execute it.
|
|
|
+
|
|
|
+.. note::
|
|
|
|
|
|
-If you not start your Apache DolphinScheduler server, you could find the way in
|
|
|
-:ref:`start:start Python gateway service` and it would have more detail about related server
|
|
|
-start. Beside attribute `run`, we have attribute `submit` for object `ProcessDefinition`
|
|
|
-and it just submit workflow to the daemon but not setting the schedule information. For
|
|
|
-more detail you could see :ref:`concept:process definition`.
|
|
|
+   If you have not started your DolphinScheduler API server, you can find how to start it in
|
|
|
+   :ref:`start:start Python gateway service`. Besides the attribute :code:`run`, there is also the attribute
|
|
|
+   :code:`submit` for the `ProcessDefinition` object, which just submits the workflow to the daemon but does not set
|
|
|
+   the workflow schedule information. For more detail, see :ref:`concept:process definition`.
|
|
|
|
|
|
DAG Graph After Tutorial Run
|
|
|
----------------------------
|
|
|
|
|
|
-After we run the tutorial code, you could login Apache DolphinScheduler web UI,
|
|
|
-go and see the `DolphinScheduler project page`_. they is a new process definition be
|
|
|
-created and named "Tutorial". It create by *PyDolphinScheduler* and the DAG graph as below
|
|
|
+After you run the tutorial code, you can log in to the DolphinScheduler web UI and go to the
|
|
|
+`DolphinScheduler project page`_. There is a new process definition created by *PyDolphinScheduler*, and it is
|
|
|
+named "tutorial" or "tutorial_decorator". The task graph of the workflow looks like this:
|
|
|
|
|
|
.. literalinclude:: ../../src/pydolphinscheduler/examples/tutorial.py
|
|
|
:language: text
|