- .. Licensed to the Apache Software Foundation (ASF) under one
- or more contributor license agreements. See the NOTICE file
- distributed with this work for additional information
- regarding copyright ownership. The ASF licenses this file
- to you under the Apache License, Version 2.0 (the
- "License"); you may not use this file except in compliance
- with the License. You may obtain a copy of the License at
- .. http://www.apache.org/licenses/LICENSE-2.0
- .. Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- KIND, either express or implied. See the License for the
- specific language governing permissions and limitations
- under the License.
- Configuration
- =============
- pydolphinscheduler ships with a built-in configuration module that provides the settings needed to start and run
- your workflow code. You can use these defaults as they are for a quick start or for a simple job such as a POC.
- For heavier use, and especially in production, you will probably need to change the built-in configuration.
- We have two ways to modify the configuration:
- `Using Environment Variables`_: The more lightweight way to modify the configuration. It is useful in
- containerized scenarios, such as Docker and Kubernetes, or when you want to temporarily override values from
- the configuration file.
- `Using Configuration File`_: The more general way to modify the configuration. It is useful when you want
- to persist and manage your configuration in a single file.
- Using Environment Variables
- ---------------------------
- You can change the configuration by adding or modifying the operating system's environment variables. Any
- method works, as long as the environment variables are actually set. We use two common
- ways, `Bash <by bash>`_ and `Python OS Module <by python os module>`_, as examples:
- By Bash
- ^^^^^^^
- Setting environment variables via `Bash` is the most straightforward way. Here are some examples of
- how to change them in Bash.
- .. code-block:: bash
- # Modify Java Gateway Address
- export PYDS_JAVA_GATEWAY_ADDRESS="192.168.1.1"
- # Modify Workflow Default User
- export PYDS_WORKFLOW_USER="custom-user"
- After executing the commands above, both ``PYDS_JAVA_GATEWAY_ADDRESS`` and ``PYDS_WORKFLOW_USER`` will be changed.
- The next time you execute and submit your workflow, it will be submitted to host `192.168.1.1` with the
- workflow user `custom-user`.
- By Python OS Module
- ^^^^^^^^^^^^^^^^^^^
- pydolphinscheduler is a Python API for Apache DolphinScheduler, so you can also add or modify environment
- variables via the Python ``os`` module. In this example, we set the variables to the same values as in
- `Bash <by bash>`_. The change takes effect the next time you run your workflow, provided you call the workflow's
- ``run`` or ``submit`` method after the ``os.environ`` statements.
- .. code-block:: python
- import os
- # Modify Java Gateway Address
- os.environ["PYDS_JAVA_GATEWAY_ADDRESS"] = "192.168.1.1"
- # Modify Workflow Default User
- os.environ["PYDS_WORKFLOW_USER"] = "custom-user"
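When an override should apply to only part of a script, the standard library's ``unittest.mock.patch.dict`` can scope the change and restore the previous environment afterwards. The following is a minimal sketch and not part of pydolphinscheduler. It assumes the configuration is read while the ``with`` block is active, which you should verify for your version, and the workflow code itself is left as a placeholder comment.

```python
import os
from unittest import mock

# Same example values as in the snippets above.
overrides = {
    "PYDS_JAVA_GATEWAY_ADDRESS": "192.168.1.1",
    "PYDS_WORKFLOW_USER": "custom-user",
}

# Remember the value (possibly None) that was set before the override.
before = os.environ.get("PYDS_WORKFLOW_USER")

# The variables are overridden only inside the ``with`` block.
with mock.patch.dict(os.environ, overrides):
    inside = os.environ["PYDS_WORKFLOW_USER"]
    # Define and submit your workflow here (placeholder).

# On exit, the previous environment is restored.
after = os.environ.get("PYDS_WORKFLOW_USER")
```

This keeps the override local to one block instead of changing the environment for the rest of the process.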
- All Configurations in Environment Variables
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- All environment variables are listed below. You can modify their values via `Bash <by bash>`_ or `Python OS Module <by python os module>`_.
- +------------------+------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | Variable Section | Variable Name | Description |
- +==================+====================================+====================================================================================================================+
- | | ``PYDS_JAVA_GATEWAY_ADDRESS`` | Default Java gateway address, will use its value when it is set. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | Java Gateway | ``PYDS_JAVA_GATEWAY_PORT`` | Default Java gateway port, will use its value when it is set. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_JAVA_GATEWAY_AUTO_CONVERT`` | Default boolean Java gateway auto convert, will use its value when it is set. |
- +------------------+------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_USER_NAME`` | Default user name, used when the user's ``name`` is not specified. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_USER_PASSWORD`` | Default user password, used when the user's ``password`` is not specified. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | Default User | ``PYDS_USER_EMAIL`` | Default user email, used when the user's ``email`` is not specified. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_USER_PHONE`` | Default user phone, used when the user's ``phone`` is not specified. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_USER_STATE`` | Default user state, used when the user's ``state`` is not specified. |
- +------------------+------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_WORKFLOW_PROJECT`` | Default workflow project name, will use its value when workflow does not specify the attribute ``project``. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_WORKFLOW_TENANT`` | Default workflow tenant, will use its value when workflow does not specify the attribute ``tenant``. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | Default Workflow | ``PYDS_WORKFLOW_USER`` | Default workflow user, will use its value when workflow does not specify the attribute ``user``. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_WORKFLOW_QUEUE`` | Default workflow queue, will use its value when workflow does not specify the attribute ``queue``. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_WORKFLOW_WORKER_GROUP`` | Default workflow worker group, will use its value when workflow does not specify the attribute ``worker_group``. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_WORKFLOW_RELEASE_STATE`` | Default workflow release state, will use its value when workflow does not specify the attribute ``release_state``. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_WORKFLOW_TIME_ZONE`` | Default workflow time zone, will use its value when workflow does not specify the attribute ``timezone``. |
- + +------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- | | ``PYDS_WORKFLOW_WARNING_TYPE`` | Default workflow warning type, will use its value when workflow does not specify the attribute ``warning_type``. |
- +------------------+------------------------------------+--------------------------------------------------------------------------------------------------------------------+
- .. note::
- Configuration set via environment variables applies only to the workflow being run, and it does not change the
- values in the configuration file. The :doc:`CLI <cli>` commands ``config --get`` and ``config --set`` operate
- on the configuration file, so ``config --get`` may return a value different from the one you set in the
- environment variable, and ``config --set`` will never change your environment variables.
- Using Configuration File
- ------------------------
- If you want to persist and manage configuration in a file instead of environment variables, or you want
- to keep your configuration file in a version control system such as Git or SVN, changing the
- configuration by file is the best choice.
- Export Configuration File
- ^^^^^^^^^^^^^^^^^^^^^^^^^
- pydolphinscheduler allows you to change the built-in configuration via the CLI or any editor you like. The
- built-in configuration is bundled in the package, but you can also export it locally with the CLI:
- .. code-block:: bash
- pydolphinscheduler config --init
- This creates a new YAML file at `~/pydolphinscheduler/config.yaml` by default. If you want to export
- it to another path, set `PYDS_HOME` before you run the command :code:`pydolphinscheduler config --init`.
- .. code-block:: bash
- export PYDS_HOME=<CUSTOM_PATH>
- pydolphinscheduler config --init
- After that, your configuration file will be exported to `<CUSTOM_PATH>/config.yaml` instead of the default path.
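The path rule described above can be sketched in a few lines of Python. ``config_path`` is a hypothetical helper written for illustration and is not pydolphinscheduler's own implementation. It only encodes what this section states: the file is named ``config.yaml`` and lives under ``PYDS_HOME``, falling back to ``~/pydolphinscheduler``.

```python
import os
from pathlib import Path


def config_path(env=None):
    """Return the config file location implied by ``PYDS_HOME`` (hypothetical helper)."""
    env = os.environ if env is None else env
    # Fall back to the documented default directory when PYDS_HOME is unset.
    home = env.get("PYDS_HOME", "~/pydolphinscheduler")
    return Path(home).expanduser() / "config.yaml"


# An explicit mapping is passed instead of os.environ so the result is predictable.
default_location = config_path({})
custom_location = config_path({"PYDS_HOME": "/tmp/pyds"})
```

Calling ``config_path()`` without arguments uses the real process environment.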
- Change Configuration
- ^^^^^^^^^^^^^^^^^^^^
- In section `export configuration file`_ you exported the configuration file locally. Because it is a local file,
- you can edit it with any editor you like. After you save your changes, the new configuration takes effect
- the next time you run your workflow code.
- You can also query or change the configuration via the CLI with :code:`config --get <config>` or :code:`config --set <config> <val>`.
- Both `--get` and `--set` can be passed one or more times in a single command. You can only set a leaf
- node of the configuration, but you can get a parent configuration. Some simple examples follow:
- .. code-block:: bash
- # Get a single leaf configuration.
- # The output looks like this:
- # java_gateway.address = 127.0.0.1
- pydolphinscheduler config --get java_gateway.address
- # Get multiple leaf configurations.
- # The output looks like this:
- # java_gateway.address = 127.0.0.1
- # java_gateway.port = 25333
- pydolphinscheduler config --get java_gateway.address --get java_gateway.port
- # Get a parent configuration that contains multiple leaf nodes.
- # The output looks like this:
- # java_gateway = ordereddict([('address', '127.0.0.1'), ('port', 25333), ('auto_convert', True)])
- pydolphinscheduler config --get java_gateway
- # Set a single configuration.
- # The output looks like this:
- # Set configuration done.
- pydolphinscheduler config --set java_gateway.address 192.168.1.1
- # Set multiple configurations.
- # The output looks like this:
- # Set configuration done.
- pydolphinscheduler config --set java_gateway.address 192.168.1.1 --set java_gateway.port 25334
- # Setting a configuration that is not a leaf node fails.
- # The output looks like this:
- # Raise error.
- pydolphinscheduler config --set java_gateway 192.168.1.1,25334,True
- For more information about our CLI, see the :doc:`cli` document.
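The leaf and parent rule for dotted keys can be illustrated with a small Python sketch over a nested dictionary. ``get_config`` and ``set_config`` are hypothetical helpers written for this example. They are not the CLI's implementation, and the error type is an assumption.

```python
def get_config(config, key):
    """Return the value at a dotted key; a parent key returns the whole sub-dict."""
    node = config
    for part in key.split("."):
        node = node[part]
    return node


def set_config(config, key, value):
    """Set the value at a dotted key, rejecting keys that name a parent node."""
    *parents, leaf = key.split(".")
    node = config
    for part in parents:
        node = node[part]
    if isinstance(node[leaf], dict):
        # Mirrors the rule above: only leaf nodes can be set.
        raise ValueError(f"{key} is not a leaf node")
    node[leaf] = value


# Values taken from the CLI example output above.
config = {"java_gateway": {"address": "127.0.0.1", "port": 25333, "auto_convert": True}}
set_config(config, "java_gateway.address", "192.168.1.1")
```

Here ``get_config(config, "java_gateway")`` returns the whole sub-dictionary, while ``set_config(config, "java_gateway", ...)`` raises an error.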
- All Configurations in File
- ^^^^^^^^^^^^^^^^^^^^^^^^^^
- Here are all our configurations for pydolphinscheduler.
- .. literalinclude:: ../../src/pydolphinscheduler/core/default_config.yaml
- :language: yaml
- :lines: 18-
- Priority
- --------
- There are two ways to modify the configuration, and pydolphinscheduler also has a built-in configuration. It is
- important to understand which one takes priority when you combine them. From highest to lowest, the
- priority is:
- ``Environment Variables > Configurations File > Built-in Configurations``
- This means that settings in environment variables or in the configuration file override the built-in ones,
- and environment variables override the configuration file. You can therefore temporarily modify a configuration
- by setting an environment variable, without modifying the global config in the configuration file.
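The priority order can be sketched as a simple lookup chain. ``resolve`` and ``BUILT_IN`` are hypothetical names used only for illustration, and the built-in values are the ones shown in the CLI example output above. pydolphinscheduler's actual lookup may differ in detail.

```python
import os

# Illustrative built-in defaults (values from the CLI example output above).
BUILT_IN = {"java_gateway.address": "127.0.0.1", "java_gateway.port": 25333}


def resolve(key, env_name, file_config, env=None):
    """Resolve a setting as: environment variable > configuration file > built-in."""
    env = os.environ if env is None else env
    if env_name in env:
        # Environment variables are always strings.
        return env[env_name]
    if key in file_config:
        return file_config[key]
    return BUILT_IN[key]


# Pretend the configuration file overrides only the address.
file_config = {"java_gateway.address": "10.0.0.1"}

from_env = resolve(
    "java_gateway.address",
    "PYDS_JAVA_GATEWAY_ADDRESS",
    file_config,
    env={"PYDS_JAVA_GATEWAY_ADDRESS": "192.168.1.1"},
)
from_file = resolve("java_gateway.address", "PYDS_JAVA_GATEWAY_ADDRESS", file_config, env={})
from_built_in = resolve("java_gateway.port", "PYDS_JAVA_GATEWAY_PORT", file_config, env={})
```

In this sketch ``from_env`` comes from the environment, ``from_file`` from the file, and ``from_built_in`` from the built-in defaults.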