.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements. See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership. The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License. You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied. See the License for the
   specific language governing permissions and limitations
   under the License.

Getting Started
===============

To get started with *PyDolphinScheduler*, you need Python and pip installed
on your machine. If you are already set up, you can skip straight to
`Installing PyDolphinScheduler`_; otherwise, continue with
`Installing Python`_.

Installing Python
-----------------

How to install `python` and `pip` depends on your operating system. The
Python wiki provides up-to-date `instructions for all platforms here`_.
After you open the page and choose your operating system, you are offered
a choice of Python versions. *PyDolphinScheduler* recommends Python 3.6 or
above, and we highly recommend installing a *Stable Release* rather than
a *Pre-release*.

After you have downloaded and installed Python, open your terminal and run
:code:`python --version` to check that the installation succeeded. If all
is well, the version is printed to the console without errors (here is an
example with Python 3.8.7 installed):

.. code-block:: bash

   $ python --version
   Python 3.8.7
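
The same check can be done from inside the interpreter. The following is a minimal sketch and not part of *PyDolphinScheduler*: the helper names ``check_python`` and ``has_pip`` are hypothetical, and the ``(3, 6)`` minimum is an assumption taken from the recommendation above.

.. code-block:: python

   import importlib.util
   import sys


   def check_python(min_version=(3, 6), version_info=None):
       """Return True if the interpreter is at least ``min_version``.

       ``version_info`` defaults to the running interpreter's version.
       """
       current = tuple((version_info or sys.version_info)[:2])
       return current >= tuple(min_version)


   def has_pip():
       """Return True if this interpreter can find the pip module."""
       return importlib.util.find_spec("pip") is not None


   if __name__ == "__main__":
       print("Python version OK:", check_python())
       print("pip available:", has_pip())

Running the script with the same ``python`` executable you plan to use for *PyDolphinScheduler* confirms that both prerequisites are visible to that interpreter.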

Installing PyDolphinScheduler
-----------------------------

Once Python is installed on your machine as described in
`installing Python`_, it is easy to install *PyDolphinScheduler* with pip.

.. code-block:: bash

   $ pip install apache-dolphinscheduler

Running the command above in your terminal installs the latest version of
*PyDolphinScheduler*. Next, `start Python Gateway Service`_ to finish the
preparation, and then go to :doc:`tutorial` to get your hands dirty. If you
want to install the unreleased version of *PyDolphinScheduler* instead, see
`installing PyDolphinScheduler in dev`_ for more detail.
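
To confirm which version pip installed, you can query the package metadata from Python. This is a minimal sketch that assumes Python 3.8 or later (for ``importlib.metadata``). The helper name ``installed_version`` is hypothetical; the distribution name is the one used in the ``pip install`` command above.

.. code-block:: python

   from importlib import metadata


   def installed_version(dist_name="apache-dolphinscheduler"):
       """Return the installed version of a distribution, or None if it is missing."""
       try:
           return metadata.version(dist_name)
       except metadata.PackageNotFoundError:
           return None


   if __name__ == "__main__":
       version = installed_version()
       if version is None:
           print("apache-dolphinscheduler is not installed in this environment")
       else:
           print("apache-dolphinscheduler", version)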

Installing PyDolphinScheduler In Dev
------------------------------------

The project is under active development, and some features have not been
released yet. If you want to try something unreleased, you can install from the
source code hosted on GitHub:

.. code-block:: bash

   # Clone the Apache DolphinScheduler repository
   $ git clone git@github.com:apache/dolphinscheduler.git
   # Install PyDolphinScheduler in develop mode (path is relative to the clone's parent directory)
   $ cd dolphinscheduler/dolphinscheduler-python/pydolphinscheduler && pip install -e .

After you have installed *PyDolphinScheduler*, remember to `start Python Gateway Service`_,
which waits for workflow definition requests from *PyDolphinScheduler*.

Start Python Gateway Service
----------------------------

Since **PyDolphinScheduler** is the Python API for `Apache DolphinScheduler`_, it
can define workflows and task structures, but it cannot run them unless you
`install Apache DolphinScheduler`_ and start its API server, which includes the
Python gateway service. We only list some key steps here; see
`install Apache DolphinScheduler`_ for more detail.

.. code-block:: bash

   # Start the DolphinScheduler api-server, which includes the Python gateway service
   $ ./bin/dolphinscheduler-daemon.sh start api-server

To check whether the server is alive, run :code:`jps`. The server is healthy
if the keyword `ApiApplicationServer` appears in the console output.

.. code-block:: bash

   $ jps
   ....
   201472 ApiApplicationServer
   ....
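
The manual check above can also be scripted. This is a minimal sketch, assuming ``jps`` prints one ``<pid> <main class>`` pair per line as in the sample output. The helper name ``find_api_server`` is hypothetical and not part of *PyDolphinScheduler*.

.. code-block:: python

   import shutil
   import subprocess


   def find_api_server(jps_output, keyword="ApiApplicationServer"):
       """Return the PIDs of JVM processes whose main class equals ``keyword``."""
       pids = []
       for line in jps_output.splitlines():
           parts = line.split()
           if len(parts) == 2 and parts[0].isdigit() and parts[1] == keyword:
               pids.append(int(parts[0]))
       return pids


   if __name__ == "__main__":
       if shutil.which("jps") is None:
           # jps ships with the JDK; without it this check cannot run.
           print("jps not found on PATH")
       else:
           output = subprocess.check_output(["jps"], universal_newlines=True)
           pids = find_api_server(output)
           if pids:
               print("api-server is running, PID(s):", pids)
           else:
               print("api-server not found in jps output")

Keeping the parsing in its own function makes it easy to reuse the check on output captured from a remote host.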

.. note::

   Make sure the Python gateway service is enabled and started along with `api-server`. It is controlled by the
   YAML key `python-gateway.enabled : true` in the api-server configuration file `api-server/conf/application.yaml`.
   The default value is true, so the Python gateway service starts whenever the api-server is started.
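
A quick way to verify that the gateway is actually accepting connections is to open a TCP connection to it. The sketch below is an illustration only: the helper name ``is_port_open`` is hypothetical, and port ``25333`` is an assumption (it is Py4J's default gateway port); use whatever port your ``application.yaml`` configures.

.. code-block:: python

   import socket


   def is_port_open(host, port, timeout=3.0):
       """Return True if a TCP connection to ``host:port`` succeeds within ``timeout`` seconds."""
       try:
           with socket.create_connection((host, port), timeout=timeout):
               return True
       except OSError:
           return False


   if __name__ == "__main__":
       # 25333 is an assumed port (Py4J's default); check your application.yaml.
       print("gateway reachable:", is_port_open("127.0.0.1", 25333))

The same function works for a remote API server if you pass its IP address or hostname instead of ``127.0.0.1``.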

Run an Example
--------------

Before running an example for pydolphinscheduler, you need to get the example code from its source code. You can
fetch it with a single bash command:

.. code-block:: bash

   $ wget https://raw.githubusercontent.com/apache/dolphinscheduler/dev/dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/examples/tutorial.py

Alternatively, you can copy and paste the content from `tutorial source code`_. Then run the example in your
terminal:

.. code-block:: bash

   $ python tutorial.py
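
If ``wget`` is not available on your machine, the download step can be sketched with the Python standard library instead. The helper name ``download`` is hypothetical; the URL is the same one used in the ``wget`` command above.

.. code-block:: python

   import shutil
   import urllib.request

   TUTORIAL_URL = (
       "https://raw.githubusercontent.com/apache/dolphinscheduler/dev/"
       "dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/"
       "examples/tutorial.py"
   )


   def download(url, dest):
       """Save the content at ``url`` to the local path ``dest`` and return ``dest``."""
       with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
           shutil.copyfileobj(response, out)
       return dest

Calling ``download(TUTORIAL_URL, "tutorial.py")`` writes the example into the current directory, after which ``python tutorial.py`` runs it as described above.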

If you want to submit your workflow to a remote API server, that is, your workflow script runs on a different
machine from the API server, first change the pydolphinscheduler configuration and then submit the workflow script:

.. code-block:: bash

   $ pydolphinscheduler config --init
   $ pydolphinscheduler config --set java_gateway.address <your-api-server-ip-or-hostname>
   $ python tutorial.py

.. note::

   See :doc:`config` for more information about all the configurations pydolphinscheduler supports.

After that, open your DolphinScheduler web UI to find the new workflow created by pydolphinscheduler.
It is located under `Project -> Workflow -> Workflow Definition`.

What's More
-----------

If you are not familiar with *PyDolphinScheduler*, go to :doc:`tutorial` to see how it works. If you already
know the basic usage and concepts of *PyDolphinScheduler*, you can explore all the
:doc:`tasks/index` that *PyDolphinScheduler* supports, or see our :doc:`howto/index` for useful cases.

.. _`instructions for all platforms here`: https://wiki.python.org/moin/BeginnersGuide/Download
.. _`Apache DolphinScheduler`: https://dolphinscheduler.apache.org
.. _`install Apache DolphinScheduler`: https://dolphinscheduler.apache.org/en-us/docs/latest/user_doc/guide/installation/standalone.html
.. _`tutorial source code`: https://raw.githubusercontent.com/apache/dolphinscheduler/dev/dolphinscheduler-python/pydolphinscheduler/src/pydolphinscheduler/examples/tutorial.py