.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements.  See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership.  The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License.  You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied.  See the License for the
   specific language governing permissions and limitations
   under the License.

Getting Started
===============

To get started with *PyDolphinScheduler*, make sure Python and pip are
installed on your machine. If you are already set up, you can skip straight
to `Installing PyDolphinScheduler`_; otherwise, continue with
`Installing Python`_.

Installing Python
-----------------

How to install `python` and `pip` depends on the operating system
you are using. The Python wiki provides up-to-date
`instructions for all platforms here`_. After you open that page
and choose your operating system, you are offered a choice of
Python versions. *PyDolphinScheduler* recommends Python 3.6 or above,
and we highly recommend installing a *Stable Release* instead
of a *Pre-release*.

After you have downloaded and installed Python, open your terminal and
run :code:`python --version` to check whether the installation
succeeded. If all is well, the version is printed to the console
without errors (the example below was run after installing Python 3.8.7):

.. code-block:: bash

   python --version

The command prints the Python version, such as *Python 3.8.7*.

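
The same check can also be made from inside Python. The snippet below is a
minimal sketch that assumes the 3.6 minimum recommended in this section; the
helper name ``meets_minimum`` is hypothetical and is not part of
*PyDolphinScheduler*.

.. code-block:: python

   import sys

   # Minimum version recommended in this guide (an assumption of this sketch).
   MINIMUM = (3, 6)


   def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
       """Return True if the (major, minor) version is at least ``minimum``."""
       return tuple(version_info[:2]) >= tuple(minimum)


   if __name__ == "__main__":
       status = "OK" if meets_minimum() else "too old"
       print("Python", sys.version.split()[0], status)

Comparing only the major and minor numbers keeps the check independent of the
patch release that happens to be installed.
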
Installing PyDolphinScheduler
-----------------------------

Once Python is installed on your machine as described in
`installing Python`_, it is easy to install *PyDolphinScheduler* with pip.

.. code-block:: bash

   python -m pip install apache-dolphinscheduler

Running this command in your terminal installs the latest released version of
*PyDolphinScheduler*. Next, `start Python Gateway Service`_ to finish
the preparation, and then go to :doc:`tutorial` to get your hands dirty. If you
want to install the unreleased version of *PyDolphinScheduler* instead, see
`installing PyDolphinScheduler in dev branch`_ for more detail.

.. note::

   We have released several pre-release packages on PyPI. You can see all released packages,
   including pre-releases, in the `release history <https://pypi.org/project/apache-dolphinscheduler/#history>`_.
   To install a pre-release package, pin the package version. For example, to
   install version `3.0.0-beta-2`, run
   :code:`python -m pip install apache-dolphinscheduler==3.0.0b2`.

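
To confirm which version pip actually installed, you can query the package
metadata from Python. This is a minimal sketch, assuming Python 3.8 or newer
(where ``importlib.metadata`` is part of the standard library); the helper name
``installed_version`` is hypothetical, and only the distribution name
``apache-dolphinscheduler`` comes from this guide.

.. code-block:: python

   from importlib import metadata  # standard library since Python 3.8


   def installed_version(dist_name="apache-dolphinscheduler"):
       """Return the installed version string, or None if it is not installed."""
       try:
           return metadata.version(dist_name)
       except metadata.PackageNotFoundError:
           return None


   if __name__ == "__main__":
       print(installed_version() or "apache-dolphinscheduler is not installed")
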
Installing PyDolphinScheduler In DEV Branch
-------------------------------------------

The project is under active development, so some features have not been released yet.
If you want to try something unreleased, you can install from the source code,
which is hosted on GitHub:

.. code-block:: bash

   # Clone Apache DolphinScheduler repository
   git clone git@github.com:apache/dolphinscheduler.git
   # Install PyDolphinScheduler in develop mode
   cd dolphinscheduler/dolphinscheduler-python/pydolphinscheduler && python -m pip install -e .

After you have installed *PyDolphinScheduler*, remember to `start Python Gateway Service`_,
which waits for workflow definition requests from *PyDolphinScheduler*.

The commands above clone the whole DolphinScheduler source code to your machine. If you only want to install the latest
pydolphinscheduler package and do not need the other code (including the Python gateway service code), you can run:

.. code-block:: bash

   # Quote the URL so that the shell does not interpret the '&' character
   pip install -e "git+https://github.com/apache/dolphinscheduler.git#egg=apache-dolphinscheduler&subdirectory=dolphinscheduler-python/pydolphinscheduler"

Start Python Gateway Service
----------------------------

Since **PyDolphinScheduler** is the Python API for `Apache DolphinScheduler`_, it
can define workflow and task structures, but it cannot run them unless you
`install Apache DolphinScheduler`_ and start its API server, which includes the
Python gateway service. We only list some key steps here; see
`install Apache DolphinScheduler`_ for more detail.

.. code-block:: bash

   # Start the DolphinScheduler api-server, which includes the Python gateway service
   ./bin/dolphinscheduler-daemon.sh start api-server

To check whether the server is alive, run :code:`jps`. The
server is healthy if the keyword `ApiApplicationServer` appears in the console output.

.. code-block:: bash

   jps
   # ....
   # 201472 ApiApplicationServer
   # ....

.. note::

   Make sure the Python gateway service is enabled and started along with `api-server`. It is controlled by the
   YAML setting `python-gateway.enabled : true` in the api-server configuration file `api-server/conf/application.yaml`.
   The default value is true, so the Python gateway service starts when the API server is started.

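
:code:`jps` only shows that the process is running. To check that the gateway
also accepts connections, you can probe its TCP port from Python. The snippet
below is a minimal sketch: the helper name ``is_port_open`` is hypothetical, and
the host ``127.0.0.1`` and port ``25333`` (the Py4J default) are assumptions
that this guide does not state, so adjust them to match your
`api-server/conf/application.yaml`.

.. code-block:: python

   import socket


   def is_port_open(host, port, timeout=3.0):
       """Return True if a TCP connection to (host, port) succeeds in time."""
       try:
           with socket.create_connection((host, port), timeout=timeout):
               return True
       except OSError:
           return False


   if __name__ == "__main__":
       # Assumed values: adjust them to match your api-server configuration.
       host, port = "127.0.0.1", 25333
       print(f"{host}:{port} reachable: {is_port_open(host, port)}")

A successful connection only proves that something is listening on that port;
it does not verify that the listener is the Python gateway service.
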
Run an Example
--------------

Before running an example for pydolphinscheduler, you need to get the example code from its source code. You can run
a single bash command to get it:

.. code-block:: bash

   wget https://raw.githubusercontent.com/apache/dolphinscheduler/dev/dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/examples/tutorial.py

Alternatively, you can copy and paste the content from the `tutorial source code`_. Then run the example in your
terminal:

.. code-block:: bash

   python tutorial.py

If you want to submit your workflow to a remote API server, meaning that your workflow script runs on a different
machine from the API server, first change the pydolphinscheduler configuration and then submit the workflow script:

.. code-block:: bash

   pydolphinscheduler config --init
   pydolphinscheduler config --set java_gateway.address <YOUR-API-SERVER-IP-OR-HOSTNAME>
   python tutorial.py

.. note::

   See :doc:`config` for more information about all the configurations pydolphinscheduler supports.

After that, open your DolphinScheduler web UI to find the new workflow created by pydolphinscheduler.
It is located under `Project -> Workflow -> Workflow Definition`.

What's More
-----------

If you are not familiar with *PyDolphinScheduler*, go to :doc:`tutorial` to see how it works. If
you already know the basic usage and concepts of *PyDolphinScheduler*, you can explore all the
:doc:`tasks/index` that *PyDolphinScheduler* supports, or see our :doc:`howto/index` for useful cases.

.. _`instructions for all platforms here`: https://wiki.python.org/moin/BeginnersGuide/Download
.. _`Apache DolphinScheduler`: https://dolphinscheduler.apache.org
.. _`install Apache DolphinScheduler`: https://dolphinscheduler.apache.org/en-us/docs/latest/user_doc/guide/installation/standalone.html
.. _`tutorial source code`: https://raw.githubusercontent.com/apache/dolphinscheduler/dev/dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/examples/tutorial.py