.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements.  See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership.  The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License.  You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied.  See the License for the
   specific language governing permissions and limitations
   under the License.

Getting Started
===============

To get started with *PyDolphinScheduler*, make sure Python and pip are
installed on your machine. If you are already set up, you can skip straight
to `Installing PyDolphinScheduler`_; otherwise, continue with
`Installing Python`_.

Installing Python
-----------------

How to install `python` and `pip` depends on the operating system
you are using. The Python wiki provides up-to-date
`instructions for all platforms here`_. After you open that page and
choose your operating system, you will be asked to select a Python
version. *PyDolphinScheduler* recommends Python 3.6 or above, and we highly
recommend installing a *Stable Release* instead of a *Pre-release*.

After you have downloaded and installed Python, open your terminal and
run :code:`python --version` to check whether the installation
succeeded. If everything is fine, the version is printed to the console
without any error (here is an example with Python 3.8.7 installed):

.. code-block:: bash

   python --version

This prints the Python version, such as *Python 3.8.7*.

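
The same check can also be run from inside Python, which helps when several
interpreters are installed and you want to test the one that will actually run
your workflow scripts. The following is only a minimal sketch and is not part of
*PyDolphinScheduler*: the names :code:`check_environment` and :code:`MIN_VERSION`
are illustrative, and the minimum version is an assumption taken from the
Python 3.6 recommendation above.

.. code-block:: python

   import importlib.util
   import sys

   # Assumption: minimum version taken from the recommendation in this guide.
   MIN_VERSION = (3, 6)


   def check_environment(min_version=MIN_VERSION):
       """Return (version_ok, pip_ok) for the interpreter running this script."""
       version_ok = sys.version_info[:2] >= tuple(min_version)
       # find_spec returns None when the module cannot be located.
       pip_ok = importlib.util.find_spec("pip") is not None
       return version_ok, pip_ok


   if __name__ == "__main__":
       version_ok, pip_ok = check_environment()
       print("Python", sys.version.split()[0], "OK" if version_ok else "too old")
       print("pip", "found" if pip_ok else "missing")
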
Installing PyDolphinScheduler
-----------------------------

Once Python is installed on your machine as described in
`installing Python`_, it is easy to install *PyDolphinScheduler* with pip.

.. code-block:: bash

   python -m pip install apache-dolphinscheduler

Running the command above in your terminal installs the latest version of
*PyDolphinScheduler*. Next, `start Python Gateway Service`_ to finish
the preparation, and then go to :doc:`tutorial` to get your hands dirty. If you
want to install an unreleased version of *PyDolphinScheduler* instead, see
`installing PyDolphinScheduler in dev branch`_ for more detail.

.. note::

   Currently, we have released multiple pre-release packages on PyPI. You can see all released packages,
   including pre-releases, in the `release history <https://pypi.org/project/apache-dolphinscheduler/#history>`_.
   To install a pre-release package, pin the package version. For example, to install
   version `3.0.0-beta-2`, run
   :code:`python -m pip install apache-dolphinscheduler==3.0.0b2`.

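
To confirm which version pip actually installed, you can query the package
metadata from Python. This is a hedged sketch rather than an official
*PyDolphinScheduler* helper: the function name :code:`installed_version` is
illustrative, and it assumes Python 3.8 or newer, where the standard library
provides :code:`importlib.metadata`. On older interpreters,
:code:`python -m pip show apache-dolphinscheduler` gives the same information.

.. code-block:: python

   from importlib import metadata  # standard library since Python 3.8


   def installed_version(dist_name="apache-dolphinscheduler"):
       """Return the installed version string, or None if it is not installed."""
       try:
           return metadata.version(dist_name)
       except metadata.PackageNotFoundError:
           return None


   if __name__ == "__main__":
       version = installed_version()
       if version is None:
           print("apache-dolphinscheduler is not installed")
       else:
           print("apache-dolphinscheduler", version)
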
Installing PyDolphinScheduler In DEV Branch
-------------------------------------------

The project is under active development, and some features have not been released yet.
If you want to try something unreleased, you can install from the source code,
which we host on GitHub:

.. code-block:: bash

   # Clone Apache DolphinScheduler repository
   git clone git@github.com:apache/dolphinscheduler.git
   # Install PyDolphinScheduler in develop (editable) mode from inside the cloned repository
   cd dolphinscheduler/dolphinscheduler-python/pydolphinscheduler && python -m pip install -e .

After installing *PyDolphinScheduler*, remember to `start Python Gateway Service`_,
which waits for workflow definition requests from *PyDolphinScheduler*.

Start Python Gateway Service
----------------------------

Since **PyDolphinScheduler** is the Python API for `Apache DolphinScheduler`_, it
can define workflows and task structures, but cannot run them unless you
`install Apache DolphinScheduler`_ and start its API server, which includes the
Python gateway service. We only list some key steps here; see
`install Apache DolphinScheduler`_ for more detail.

.. code-block:: bash

   # Start DolphinScheduler api-server, which includes the Python gateway service
   ./bin/dolphinscheduler-daemon.sh start api-server

To check whether the server is alive, run :code:`jps`. The
server is healthy if the keyword `ApiApplicationServer` appears in the console output.

.. code-block:: bash

   jps
   # ....
   # 201472 ApiApplicationServer
   # ....

.. note::

   Please make sure the Python gateway service is enabled and started along with `api-server`. It is controlled by
   the YAML setting `python-gateway.enabled : true` in the api-server configuration file `api-server/conf/application.yaml`.
   The default value is true, so the Python gateway service starts whenever the API server is started.

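
Besides looking at the process list, you can check from the machine that runs
your workflow scripts whether the gateway accepts TCP connections. The sketch
below is illustrative only: :code:`gateway_reachable` is a hypothetical helper,
and the host and port are assumptions (25333 is the default port of py4j, so
adjust both values to your deployment). A successful connection only shows that
something is listening on that port, not that it is the Python gateway service.

.. code-block:: python

   import socket

   # Assumptions: the gateway listens on this host and port. 25333 is the
   # py4j default; adjust both values to match your deployment.
   GATEWAY_HOST = "127.0.0.1"
   GATEWAY_PORT = 25333


   def gateway_reachable(host=GATEWAY_HOST, port=GATEWAY_PORT, timeout=3.0):
       """Return True if a TCP connection to host:port can be opened."""
       try:
           with socket.create_connection((host, port), timeout=timeout):
               return True
       except OSError:
           # Covers refused connections, timeouts, and name resolution errors.
           return False


   if __name__ == "__main__":
       state = "reachable" if gateway_reachable() else "not reachable"
       print(f"Python gateway at {GATEWAY_HOST}:{GATEWAY_PORT} is {state}")
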
Run an Example
--------------

Before running an example for pydolphinscheduler, you need to get the example code from its source code. A
single bash command downloads it:

.. code-block:: bash

   wget https://raw.githubusercontent.com/apache/dolphinscheduler/dev/dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/examples/tutorial.py

Alternatively, copy and paste the content from `tutorial source code`_. Then run the example in your
terminal:

.. code-block:: bash

   python tutorial.py

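
If :code:`wget` is not available (for example on Windows), the download step
above can also be done with Python's standard library. This is a minimal
sketch, not an official tool: the helper names :code:`file_name_from_url` and
:code:`download` are illustrative, while the URL is the one used in the
:code:`wget` command above.

.. code-block:: python

   from pathlib import Path, PurePosixPath
   from urllib.parse import urlparse
   from urllib.request import urlopen

   TUTORIAL_URL = (
       "https://raw.githubusercontent.com/apache/dolphinscheduler/dev/"
       "dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/"
       "examples/tutorial.py"
   )


   def file_name_from_url(url):
       """Return the last path component of the URL, e.g. 'tutorial.py'."""
       return PurePosixPath(urlparse(url).path).name


   def download(url=TUTORIAL_URL, dest_dir="."):
       """Download url into dest_dir and return the path of the saved file."""
       target = Path(dest_dir) / file_name_from_url(url)
       with urlopen(url, timeout=30) as response:
           target.write_bytes(response.read())
       return target


   # Usage: download() saves tutorial.py into the current directory.
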
If you want to submit your workflow to a remote API server, meaning that your workflow script runs on a different
host from the API server, first change the pydolphinscheduler configuration and then submit the workflow script:

.. code-block:: bash

   pydolphinscheduler config --init
   pydolphinscheduler config --set java_gateway.address <YOUR-API-SERVER-IP-OR-HOSTNAME>
   python tutorial.py

.. note::

   See :doc:`config` for more information about all the configurations pydolphinscheduler supports.

After that, open your DolphinScheduler web UI to find the new workflow created by pydolphinscheduler.
The path in the web UI is `Project -> Workflow -> Workflow Definition`.

What's More
-----------

If you are not familiar with *PyDolphinScheduler*, go to :doc:`tutorial` to see how it works. If
you already know the basic usage and concepts of *PyDolphinScheduler*, you can explore all the
:doc:`tasks/index` that *PyDolphinScheduler* supports, or see our :doc:`howto/index` for useful cases.

.. _`instructions for all platforms here`: https://wiki.python.org/moin/BeginnersGuide/Download
.. _`Apache DolphinScheduler`: https://dolphinscheduler.apache.org
.. _`install Apache DolphinScheduler`: https://dolphinscheduler.apache.org/en-us/docs/latest/user_doc/guide/installation/standalone.html
.. _`tutorial source code`: https://raw.githubusercontent.com/apache/dolphinscheduler/dev/dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/examples/tutorial.py