DolphinScheduler

Stone

Note:

This document applies to DolphinScheduler 3.1.6.

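Introduction

DolphinScheduler is a distributed, easily extensible open-source workflow scheduling system. It supports multi-tenancy, high availability, flexible timed scheduling, and data processing, and can be used for offline computing, real-time processing, machine learning, and similar workloads in the big data ecosystem. For details, see the official documentation.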

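Environment Requirements

Operating System

The operating system requirements are as follows:

Operating system            Version
Red Hat Enterprise Linux    7.0 and above
CentOS                      7.0 and above
Oracle Enterprise Linux     7.0 and above
Ubuntu LTS                  16.04 and above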

Servers

The hardware requirements are as follows:

CPU         Memory    Disk type    Network        Instances
4 cores+    8 GB+     SAS          Gigabit NIC    1+

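Note:

  • The configuration above is the minimum for deploying DolphinScheduler; a higher configuration is strongly recommended for production.
  • A disk size of 50 GB or more is recommended, with the system disk and data disk kept separate.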
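
The minimum sizing above can be checked on each host before installing. The following is a minimal sketch, not part of the official tooling. The function names and the 7 GiB memory threshold are assumptions made for illustration: a machine sold as "8 GB" reports somewhat less than 8 GiB in MemTotal because the kernel reserves memory.

```shell
#!/bin/sh
# Sketch: compare a host against the minimum sizing (4 cores, about 8 GB).

# Count logical CPUs from /proc/cpuinfo-style input on stdin.
cpu_count() {
    grep -c '^processor'
}

# Print MemTotal in kB from /proc/meminfo-style input on stdin.
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }'
}

# Assumed thresholds: 4 cores and 7 GiB (7 * 1024 * 1024 kB).
MIN_CORES=4
MIN_MEM_KB=7340032

# meets_minimum CORES MEM_KB -> exit status 0 when both thresholds are met.
meets_minimum() {
    [ "$1" -ge "$MIN_CORES" ] && [ "$2" -ge "$MIN_MEM_KB" ]
}

# Usage on a real host:
#   cores=$(cpu_count < /proc/cpuinfo)
#   mem=$(mem_total_kb < /proc/meminfo)
#   meets_minimum "$cores" "$mem" && echo "OK" || echo "below minimum"
```

Reading from stdin keeps the parsing functions easy to try against saved copies of /proc/cpuinfo and /proc/meminfo from other hosts.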

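Network

The network port requirements are as follows:

Component               Default port    Description
MasterServer            5678            Not a communication port; it only needs to be free on the local host
WorkerServer            1234            Not a communication port; it only needs to be free on the local host
ApiApplicationServer    12345           Backend communication port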
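
Since the ports only need to be free on the local host, a quick check before installation can save a failed start. This is a minimal sketch that assumes the ss utility (from iproute2) is available. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: report whether the default DolphinScheduler ports are free locally.

# Read `ss -ltn` output on stdin and print each listening port once.
listening_ports() {
    awk 'NR > 1 { n = split($4, parts, ":"); print parts[n] }' | sort -un
}

# port_status PORT PORT_LIST -> prints "in use" or "free".
port_status() {
    if printf '%s\n' "$2" | grep -qxF "$1"; then
        echo "in use"
    else
        echo "free"
    fi
}

# Check the three default ports: MasterServer, WorkerServer, ApiApplicationServer.
check_ds_ports() {
    ports=$(ss -ltn | listening_ports)
    for port in 5678 1234 12345; do
        echo "$port: $(port_status "$port" "$ports")"
    done
}
```

Run check_ds_ports on each host. Any port reported as "in use" needs either the conflicting service moved or the DolphinScheduler port changed.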

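Deployment

DolphinScheduler can be deployed in the following ways:

  • Standalone deployment
  • Pseudo-cluster deployment
  • Cluster deployment
  • Kubernetes deployment

Production environments generally use cluster deployment or Kubernetes deployment.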

Cluster Deployment

Environment

The cluster is deployed on 3 hosts:

Hostname    IP                OS          CPU        Memory    Roles
dsnode1     192.168.44.140    RHEL 7.9    4 cores    8 GB      Master, Worker
dsnode2     192.168.44.141    RHEL 7.9    4 cores    8 GB      Master, Worker
dsnode3     192.168.44.142    RHEL 7.9    4 cores    8 GB      Worker, Alert Server, Api Server

[root@dsnode1 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.9 (Maipo)
[root@dsnode1 ~]# grep 'processor' /proc/cpuinfo | uniq | wc -l
4
[root@dsnode1 ~]# cat /proc/meminfo | grep MemTotal
MemTotal:       7914804 kB

Unless otherwise noted, the following steps must be performed on every host in the cluster.

Disable SELinux

Edit /etc/selinux/config and change SELINUX=enforcing to SELINUX=disabled:

[root@dsnode1 ~]# sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config

After the change, reboot the host.

[root@dsnode1 ~]# init 6

After the reboot, confirm that the SELinux status is disabled:

[root@dsnode1 ~]# sestatus 
SELinux status:                 disabled
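
The sed command used in this section only matches SELINUX=enforcing, so a host currently set to permissive would be left unchanged. The following sketch rewrites the SELINUX= line whatever its current value. The function name is hypothetical, and the file path is passed as an argument so the function can be tried on a copy first.

```shell
#!/bin/sh
# Sketch: set SELINUX=disabled regardless of the current mode.
# disable_selinux_config FILE  (FILE would be /etc/selinux/config on a real host)
disable_selinux_config() {
    cfg=$1
    tmp=$(mktemp) || return 1
    # Only the SELINUX= line is rewritten; SELINUXTYPE= is left untouched.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Usage, as root:
#   disable_selinux_config /etc/selinux/config
```

Writing the result back with cat keeps the original file in place, so its ownership and permissions are unchanged. A reboot is still required for the setting to take effect.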

Disable the Firewall

Stop the firewall and disable it at boot.

[root@dsnode1 ~]# systemctl stop firewalld.service
[root@dsnode1 ~]# systemctl disable firewalld.service
[root@dsnode1 ~]# systemctl status firewalld.service 
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)

Configure Local Name Resolution

On every host in the cluster, add the hostname and IP address of all cluster machines to /etc/hosts.

[root@dsnode1 ~]# echo "192.168.44.140   dsnode1" >> /etc/hosts
[root@dsnode1 ~]# echo "192.168.44.141   dsnode2" >> /etc/hosts
[root@dsnode1 ~]# echo "192.168.44.142   dsnode3" >> /etc/hosts
[root@dsnode1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.44.140   dsnode1
192.168.44.141   dsnode2
192.168.44.142   dsnode3
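
Appending with echo adds a duplicate line every time the step is repeated. A minimal sketch of an idempotent variant follows. The function name is hypothetical, and the hosts file is passed as an argument so it can be tested on a copy.

```shell
#!/bin/sh
# Sketch: append a host entry only if the hostname is not already mapped.
# add_host_entry FILE IP NAME  (FILE would be /etc/hosts on a real host)
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # Look for NAME as a hostname field on any non-comment line.
    if awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit (found ? 0 : 1) }' "$hosts_file"
    then
        echo "$name already present, skipping"
    else
        printf '%s   %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Usage, as root:
#   add_host_entry /etc/hosts 192.168.44.140 dsnode1
#   add_host_entry /etc/hosts 192.168.44.141 dsnode2
#   add_host_entry /etc/hosts 192.168.44.142 dsnode3
```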

Configure Time Synchronization

The clocks of all hosts in the cluster must be kept in sync. RHEL 6 uses ntpd for time synchronization; RHEL 7 uses chronyd.

[root@dsnode1 ~]# vi /etc/chrony.conf 
server time.stonecoding.net iburst
[root@dsnode1 ~]# systemctl restart chronyd.service
[root@dsnode1 ~]# chronyc makestep
200 OK
[root@dsnode1 ~]# chronyc sourcestats
210 Number of sources = 1
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
TIME.stonecoding.net              6   3   325     +9.811     25.043   +498us   871us

Install Dependencies

[root@dsnode1 ~]# yum install psmisc

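Install JDK

Install the JDK on every host in the cluster and configure the environment variables. See: Installing JDK

Install ZooKeeper

Install and configure a ZooKeeper cluster. See: ZooKeeper

Note:

The ZooKeeper version must match the version of the corresponding jar bundled in the DolphinScheduler installation directory. DolphinScheduler 3.1.6 ships zookeeper-3.8.0.jar, so ZooKeeper 3.8.0 must be installed. A version mismatch produces the following error: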

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/cli/DefaultParser
        at org.apache.zookeeper.cli.DeleteAllCommand.parse(DeleteAllCommand.java:52)
        at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:438)
        at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:367)
        at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:350)
        at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:293)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.cli.DefaultParser
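
To find out which ZooKeeper version a given DolphinScheduler package bundles, the version can be read from the jar file name. This is a minimal sketch. The function name is hypothetical, and the libs directory in the usage comment is an assumption about where the jar sits in the extracted package.

```shell
#!/bin/sh
# Sketch: read the bundled ZooKeeper version from the jar file name.
# zk_jar_version LIBS_DIR prints e.g. "3.8.0" for zookeeper-3.8.0.jar.
zk_jar_version() {
    # Matches only zookeeper-<digits and dots>.jar, so companion jars such as
    # zookeeper-jute-*.jar are ignored.
    ls "$1" | sed -n 's/^zookeeper-\([0-9][0-9.]*\)\.jar$/\1/p'
}

# Usage (path is an assumption; adjust it to your layout):
#   zk_jar_version apache-dolphinscheduler-3.1.6-bin/master-server/libs
```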

Install MySQL

DolphinScheduler requires a database and supports PostgreSQL (8.2.15+) or MySQL (5.7+). MySQL is used here; for installation, see: Installing MySQL on Linux

Initialize the database:

[(none)]> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected, 2 warnings (0.01 sec)

[(none)]> CREATE USER 'dolphinscheduler'@'%' IDENTIFIED BY 'Abcd@1234';
Query OK, 0 rows affected (0.03 sec)

[(none)]> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'%';
Query OK, 0 rows affected (0.01 sec)

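Note:

The user password must not contain the special character $; otherwise the database initialization script run later fails with the following error: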

Caused by: java.sql.SQLException: Access denied for user 'dolphinscheduler'@'192.168.44.140' (using password: YES)
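
One plausible explanation, which the source does not confirm, is that the password is set in a shell file that gets sourced, so an unquoted $ triggers parameter expansion and the value sent to MySQL differs from the one that was typed. The sketch below reproduces that effect with a throwaway file. The function name, variable names and sample password are hypothetical.

```shell
#!/bin/sh
# Sketch: show how a sourced shell file changes a value that contains "$".
demo_password_expansion() {
    env_file=$(mktemp) || return 1
    # Write two assignments: one unquoted, one single-quoted.
    printf '%s\n' 'PW_UNQUOTED=Abcd$1234' "PW_QUOTED='Abcd\$1234'" > "$env_file"
    # Source the file the way an env script would be loaded.
    . "$env_file"
    rm -f "$env_file"
    printf 'unquoted: %s\nquoted:   %s\n' "$PW_UNQUOTED" "$PW_QUOTED"
}

demo_password_expansion
# prints:
#   unquoted: Abcd234
#   quoted:   Abcd$1234
```

In the unquoted case the shell reads $1 as the first positional parameter, which is empty here, and the rest of the digits stay as literal text. Single quotes prevent the expansion in this demonstration; whether that is enough for the DolphinScheduler scripts has not been verified here, so avoiding $ remains the safe choice.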

Configure the User

On every node, create a dedicated operating system user, usually named dolphinscheduler, to manage DolphinScheduler.

[root@dsnode1 ~]# groupadd -g 1100 dolphinscheduler
[root@dsnode1 ~]# useradd -g dolphinscheduler -u 1100 dolphinscheduler
[root@dsnode1 ~]# echo "123456" | passwd --stdin dolphinscheduler
[root@dsnode1 ~]# id dolphinscheduler
uid=1100(dolphinscheduler) gid=1100(dolphinscheduler) groups=1100(dolphinscheduler)

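After creating the user, set up SSH mutual trust among all nodes so that they can reach each other over SSH without a password. Oracle's sshUserSetup.sh script is used here; it only needs to be run on one node.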

[root@dsnode1 ~]# ./sshUserSetup.sh -user dolphinscheduler -hosts "dsnode1 dsnode2 dsnode3" -advanced -noPromptPassphrase

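If an error such as Bad owner or permissions on /home/gpadmin/.ssh/config appears (the path shows the home directory of the user being set up), the cause is that RHEL 7.9 changed the permission requirements for the config file; see Doc ID 2923516.1. Modify the script and run it again: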

[root@dsnode1 ~]# grep -n "chmod 644" sshUserSetup.sh | grep config
450:chmod 644 $HOME/.ssh/config
496:     $SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c \"  mkdir -p .ssh ; chmod og-w . .ssh;   touch .ssh/authorized_keys .ssh/known_hosts;  chmod 644 .ssh/authorized_keys  .ssh/known_hosts; cp  .ssh/authorized_keys .ssh/authorized_keys.tmp ;  cp .ssh/known_hosts .ssh/known_hosts.tmp; echo \\"Host *\\" > .ssh/config.tmp; echo \\"ForwardX11 no\\" >> .ssh/config.tmp; if test -f  .sshconfig ; then cp -f .ssh/config .ssh/config.backup; fi ; mv -f .ssh/config.tmp .ssh/config\""  | tee -a $LOGFILE
572:chmod 644 $HOME/.ssh/config

[root@dsnode1 ~]# vi sshUserSetup.sh
[root@dsnode1 ~]# grep -n "chmod 600" sshUserSetup.sh
450:chmod 600 $HOME/.ssh/config
496:     $SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c \"  mkdir -p .ssh ; chmod og-w . .ssh;   touch .ssh/authorized_keys .ssh/known_hosts;  chmod 644 .ssh/authorized_keys  .ssh/known_hosts; cp  .ssh/authorized_keys .ssh/authorized_keys.tmp ;  cp .ssh/known_hosts .ssh/known_hosts.tmp; echo \\"Host *\\" > .ssh/config.tmp; echo \\"ForwardX11 no\\" >> .ssh/config.tmp; if test -f  .ssh/config ; then cp -f .ssh/config .ssh/config.backup; fi ; mv -f .ssh/config.tmp .ssh/config ; chmod 600 .ssh/config\""  | tee -a $LOGFILE
572:chmod 600 $HOME/.ssh/config

[root@dsnode1 ~]# ./sshUserSetup.sh -user dolphinscheduler -hosts "dsnode1 dsnode2 dsnode3" -advanced -noPromptPassphrase
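
Once the script finishes, it is worth confirming that passwordless login really works from each node to every other node. This is a minimal sketch, to be run as the dolphinscheduler user. The function name and the SSH_CMD override are hypothetical; BatchMode=yes makes ssh fail instead of prompting for a password.

```shell
#!/bin/sh
# Sketch: verify passwordless SSH to each host given as an argument.
# SSH_CMD may be overridden (e.g. for a dry run); it defaults to ssh.
check_ssh_trust() {
    failed=0
    for host in "$@"; do
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true >/dev/null 2>&1; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Usage, on every node as the dolphinscheduler user:
#   check_ssh_trust dsnode1 dsnode2 dsnode3
```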

Configure sudo for the dolphinscheduler user on all nodes:

[root@dsnode1 ~]# visudo
## Allows people in group wheel to run all commands
%wheel  ALL=(ALL)       ALL

## Same thing without a password
%wheel  ALL=(ALL)       NOPASSWD: ALL
[root@dsnode1 ~]# usermod -aG wheel dolphinscheduler

On all nodes, switch to the dolphinscheduler user and configure the Java environment variables:

[root@dsnode1 ~]# su - dolphinscheduler 
[dolphinscheduler@dsnode1 ~]$ vi .bash_profile 
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin

export PATH

export JAVA_HOME=/usr/local/jdk1.8.0_201
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

[dolphinscheduler@dsnode1 ~]$ source .bash_profile
[dolphinscheduler@dsnode1 ~]$ java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)

Install DolphinScheduler

Download the latest stable release from the official website. The package downloaded here is apache-dolphinscheduler-3.1.6-bin.tar.gz.

On the Master node, extract the package as the dolphinscheduler user:

[root@dsnode1 ~]# su - dolphinscheduler
[dolphinscheduler@dsnode1 ~]$ tar -xvzf apache-dolphinscheduler-3.1.6-bin.tar.gz

Download the mysql-connector-java-8.0.16.jar driver and copy it into the corresponding directories:

[dolphinscheduler@dsnode1 ~]$ cp mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin/alert-server/libs/
[dolphinscheduler@dsnode1 ~]$ cp mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin/api-server/libs/
[dolphinscheduler@dsnode1 ~]$ cp mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin/master-server/libs/
[dolphinscheduler@dsnode1 ~]$ cp mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin/worker-server/libs/
[dolphinscheduler@dsnode1 ~]$ cp mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin/tools/libs/
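
The five copy commands can be written as a loop, which makes it harder to miss a component when the driver version changes. This is a minimal sketch; the function name is hypothetical and the component list is taken from the commands above.

```shell
#!/bin/sh
# Sketch: copy one JDBC driver jar into every component's libs directory.
# copy_driver JAR DS_HOME
copy_driver() {
    jar=$1
    ds_home=$2
    for component in alert-server api-server master-server worker-server tools; do
        # Stop at the first failure so a missing directory is noticed.
        cp "$jar" "$ds_home/$component/libs/" || return 1
    done
}

# Usage:
#   copy_driver mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin
```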

Configure the database connection:

[dolphinscheduler@dsnode1 ~]$ vi apache-dolphinscheduler-3.1.6-bin/bin/env/dolphinscheduler_env.sh
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# JAVA_HOME, will use it to start DolphinScheduler server
export JAVA_HOME=${JAVA_HOME:-/usr/local/jdk1.8.0_201}

# Database related configuration, set database type, username and password
export DATABASE="mysql"
export SPRING_PROFILES_ACTIVE=${DATABASE}
export SPRING_DATASOURCE_URL="jdbc:mysql://172.30.60.14:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&useSSL=false"
export SPRING_DATASOURCE_USERNAME=dolphinscheduler
export SPRING_DATASOURCE_PASSWORD=Abcd@1234

# DolphinScheduler server related configuration
export SPRING_CACHE_TYPE=${SPRING_CACHE_TYPE:-none}
export SPRING_JACKSON_TIME_ZONE=${SPRING_JACKSON_TIME_ZONE:-UTC}
export MASTER_FETCH_COMMAND_NUM=${MASTER_FETCH_COMMAND_NUM:-10}

# Registry center configuration, determines the type and link of the registry center
export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}
export REGISTRY_ZOOKEEPER_CONNECT_STRING=${REGISTRY_ZOOKEEPER_CONNECT_STRING:-zknode1:2181,zknode2:2181,zknode3:2181}

# Tasks related configurations, need to change the configuration if you use the related tasks.
export HADOOP_HOME=${HADOOP_HOME:-/opt/soft/hadoop}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/opt/soft/hadoop/etc/hadoop}
export SPARK_HOME1=${SPARK_HOME1:-/opt/soft/spark1}
export SPARK_HOME2=${SPARK_HOME2:-/opt/soft/spark2}
export PYTHON_HOME=${PYTHON_HOME:-/opt/soft/python}
export HIVE_HOME=${HIVE_HOME:-/opt/soft/hive}
export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export DATAX_HOME=${DATAX_HOME:-/opt/soft/datax}
export SEATUNNEL_HOME=${SEATUNNEL_HOME:-/opt/soft/seatunnel}
export CHUNJUN_HOME=${CHUNJUN_HOME:-/opt/soft/chunjun}

export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME/bin:$SEATUNNEL_HOME/bin:$CHUNJUN_HOME/bin:$PATH

Run the script to initialize the database:

[dolphinscheduler@dsnode1 ~]$ bash apache-dolphinscheduler-3.1.6-bin/tools/bin/upgrade-schema.sh 

Edit the installation configuration file:

[dolphinscheduler@dsnode1 ~]$ vi apache-dolphinscheduler-3.1.6-bin/bin/env/install_env.sh
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# ---------------------------------------------------------
# INSTALL MACHINE
# ---------------------------------------------------------
# A comma separated list of machine hostname or IP would be installed DolphinScheduler,
# including master, worker, api, alert. If you want to deploy in pseudo-distributed
# mode, just write a pseudo-distributed hostname
# Example for hostnames: ips="ds1,ds2,ds3,ds4,ds5", Example for IPs: ips="192.168.8.1,192.168.8.2,192.168.8.3,192.168.8.4,192.168.8.5"
ips="dsnode1,dsnode2,dsnode3"

# Port of SSH protocol, default value is 22. For now we only support same port in all `ips` machine
# modify it if you use different ssh port
sshPort="22"

# A comma separated list of machine hostname or IP would be installed Master server, it
# must be a subset of configuration `ips`.
# Example for hostnames: masters="ds1,ds2", Example for IPs: masters="192.168.8.1,192.168.8.2"
masters="dsnode1,dsnode2"

# A comma separated list of machine <hostname>:<workerGroup> or <IP>:<workerGroup>.All hostname or IP must be a
# subset of configuration `ips`, And workerGroup have default value as `default`, but we recommend you declare behind the hosts
# Example for hostnames: workers="ds1:default,ds2:default,ds3:default", Example for IPs: workers="192.168.8.1:default,192.168.8.2:default,192.168.8.3:default"
workers="dsnode1:default,dsnode2:default,dsnode3:default"

# A comma separated list of machine hostname or IP would be installed Alert server, it
# must be a subset of configuration `ips`.
# Example for hostname: alertServer="ds3", Example for IP: alertServer="192.168.8.3"
alertServer="dsnode3"

# A comma separated list of machine hostname or IP would be installed API server, it
# must be a subset of configuration `ips`.
# Example for hostname: apiServers="ds1", Example for IP: apiServers="192.168.8.1"
apiServers="dsnode3"

# The directory to install DolphinScheduler for all machine we config above. It will automatically be created by `install.sh` script if not exists.
# Do not set this configuration same as the current path (pwd). Do not add quotes to it if you using related path.
installPath="/home/dolphinscheduler/dolphinscheduler"

# The user to deploy DolphinScheduler for all machine we config above. For now user must create by yourself before running `install.sh`
# script. The user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled than the root directory needs
# to be created by this user
deployUser="dolphinscheduler"

# The root of zookeeper, for now DolphinScheduler default registry server is zookeeper.
zkRoot=${zkRoot:-"/dolphinscheduler"}
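
The comments in install_env.sh require masters, workers, alertServer and apiServers to be subsets of ips. A typo in a hostname is easy to miss, so the rule can be checked before running the installer. This is a minimal sketch with hypothetical function names; it is not part of DolphinScheduler.

```shell
#!/bin/sh
# Sketch: check that every host in a role list also appears in `ips`.

# in_list ITEM CSV -> exit status 0 when ITEM is one of the comma-separated values.
in_list() {
    case ",$2," in
        *",$1,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# check_subset NAME CSV IPS -> prints each offending host, returns 1 if any.
check_subset() {
    name=$1
    rc=0
    for entry in $(printf '%s' "$2" | tr ',' ' '); do
        host=${entry%%:*}    # strip an optional ":workerGroup" suffix
        if ! in_list "$host" "$3"; then
            echo "$name: $host is not in ips"
            rc=1
        fi
    done
    return "$rc"
}

# Usage:
#   . apache-dolphinscheduler-3.1.6-bin/bin/env/install_env.sh
#   check_subset masters "$masters" "$ips"
#   check_subset workers "$workers" "$ips"
#   check_subset alertServer "$alertServer" "$ips"
#   check_subset apiServers "$apiServers" "$ips"
```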

Run the script to install DolphinScheduler. It copies the program files to /home/dolphinscheduler/dolphinscheduler on every node:

[dolphinscheduler@dsnode1 ~]$ bash apache-dolphinscheduler-3.1.6-bin/bin/install.sh

Stop all services:

[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/stop-all.sh 

Start all services:

[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/start-all.sh 

Check the service status:

[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/status-all.sh


====================== dolphinscheduler server config =============================
1.dolphinscheduler server node config hosts:[  dsnode1,dsnode2,dsnode3  ]
2.master server node config hosts:[  dsnode1,dsnode2  ]
3.worker server node config hosts:[  dsnode1:default,dsnode2:default,dsnode3:default  ]
4.alert server node config hosts:[  dsnode3  ]
5.api server node config hosts:[  dsnode3  ]


====================== dolphinscheduler server status =============================
node server state


dsnode1  Begin status master-server......
master-server  [  RUNNING  ]
End status master-server.
dsnode2  Begin status master-server......
master-server  [  RUNNING  ]
End status master-server.
dsnode1  Begin status worker-server......
worker-server  [  RUNNING  ]
End status worker-server.
dsnode2  Begin status worker-server......
worker-server  [  RUNNING  ]
End status worker-server.
dsnode3  Begin status worker-server......
worker-server  [  RUNNING  ]
End status worker-server.
dsnode3  Begin status alert-server......
alert-server  [  RUNNING  ]
End status alert-server.
dsnode3  Begin status api-server......
api-server  [  RUNNING  ]
End status api-server.
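
With several services on several hosts, the status listing is long enough that one stopped service is easy to overlook. The sketch below filters the output down to services whose state is anything other than RUNNING. It assumes the output layout shown above (a "Begin status" line naming the host, followed by a line with the state in square brackets); the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: list services that are not RUNNING, from status-all.sh output on stdin.
# Prints "host service state" per problem and returns 1 if any were found.
not_running() {
    awk '
        /Begin status/ { host = $1 }          # remember which host comes next
        /\[ +[A-Z]+ +\]/ {                    # a line such as: name  [  STATE  ]
            state = $0
            sub(/^.*\[ +/, "", state)
            sub(/ +\].*$/, "", state)
            if (state != "RUNNING") { print host, $1, state; bad = 1 }
        }
        END { exit (bad ? 1 : 0) }'
}

# Usage:
#   ./dolphinscheduler/bin/status-all.sh | not_running && echo "all services running"
```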

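Log In to DolphinScheduler

Once the services have started, open http://192.168.44.142:12345/dolphinscheduler/ui (the host running the Api Server) in a browser to reach the login page. The default username is admin and the default password is dolphinscheduler123.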

Features

Tasks

DataX

Download the DataX package from the official website into the home directory of the dolphinscheduler user on a Worker node. The configuration file /home/dolphinscheduler/dolphinscheduler/bin/env/dolphinscheduler_env.sh sets /opt/soft/datax as the default DataX directory, so extract the package to that directory and then copy it to the other Worker nodes.

[root@dsnode1 ~]# mkdir /opt/soft
[root@dsnode1 ~]# chown dolphinscheduler:dolphinscheduler /opt/soft
[root@dsnode1 ~]# su - dolphinscheduler
[dolphinscheduler@dsnode1 ~]$ tar -xvzf datax.tar.gz -C /opt/soft/
[dolphinscheduler@dsnode1 ~]$ scp -r /opt/soft/datax/ dolphinscheduler@dsnode2:/opt/soft/
[dolphinscheduler@dsnode1 ~]$ scp -r /opt/soft/datax/ dolphinscheduler@dsnode3:/opt/soft/

Create a symbolic link for Python on every node:

[root@bdatnode1 ~]# python -V
Python 2.7.5
[dolphinscheduler@bdatnode1 ~]$ mkdir -p /opt/soft/python/bin/
[dolphinscheduler@bdatnode1 ~]$ ln -s /bin/python /opt/soft/python/bin/python2.7
[dolphinscheduler@bdatnode1 ~]$ ll /opt/soft/python/bin/
total 0
lrwxrwxrwx 1 dolphinscheduler dolphinscheduler 11 May 23 15:04 python2.7 -> /bin/python

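Alibaba's official DataX has no dedicated plugin for Greenplum; only the PostgreSQL plugin can be used, and its performance is poor. A DataX build that includes the gpdbwriter plugin can be used instead, for example HashDataInc/DataX or BoomLee/DataX.

Note:

When configuring a connection to Oracle with a service name, the jdbcUrl takes the following format (the port and the service name are separated by a slash):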

"jdbc:oracle:thin:@192.168.44.135:1521/oratest"

Example: read data from Oracle and write it to Greenplum

{
    "job": {
        "setting": {
            "speed": {
                "channel": 3
            }, 
            "errorLimit": {
                "record": 2, 
                "percentage": 0.02
            }
        }, 
        "content": [
            {
                "reader": {
                    "name": "oraclereader", 
                    "parameter": {
                        "username": "orcl", 
                        "password": "123456", 
                        "column": ["*"], 
                        "splitPk": "id", 
                        "connection": [
                            {
                                "table": [
                                    "orcl.testtab1"
                                ], 
                                "jdbcUrl": [
                                    "jdbc:oracle:thin:@192.168.44.135:1521/oratest"
                                ]
                            }
                        ]
                    }
                }, 
                "writer": {
                    "name": "postgresqlwriter", 
                    "parameter": {
                        "username": "tester", 
                        "password": "123456", 
                        "column": ["*"],              
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:postgresql://192.168.44.160:5432/pgtest", 
                                "table": [
                                    "test.testtab1"
                                ]
                            }
                        ]
                    }
                }
            }
        ]
    }
}
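
The Oracle jdbcUrl in the job above uses the service-name form, with a slash between the port and the name. The Oracle thin driver also accepts a SID form, in which a colon takes the place of the slash, and mixing the two up is a common cause of connection failures. The helper below is a minimal sketch that builds either form; the function name and its arguments are hypothetical.

```shell
#!/bin/sh
# Sketch: build an Oracle thin-driver JDBC URL.
# oracle_jdbc_url HOST PORT NAME [service|sid]
#   service (default): host:port/service_name  (slash)
#   sid:               host:port:SID           (colon)
oracle_jdbc_url() {
    case "${4:-service}" in
        service) printf 'jdbc:oracle:thin:@%s:%s/%s\n' "$1" "$2" "$3" ;;
        sid)     printf 'jdbc:oracle:thin:@%s:%s:%s\n' "$1" "$2" "$3" ;;
        *)       echo "unknown kind: $4" >&2; return 1 ;;
    esac
}

# Usage:
#   oracle_jdbc_url 192.168.44.135 1521 oratest
#   oracle_jdbc_url 192.168.44.135 1521 oratest sid
```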

SeaTunnel

Download the SeaTunnel package from the official website into the home directory of the dolphinscheduler user on a Worker node. The configuration file /home/dolphinscheduler/dolphinscheduler/bin/env/dolphinscheduler_env.sh sets /opt/soft/seatunnel as the default SeaTunnel directory, so extract the package to that directory and then copy it to the other Worker nodes.

[root@dsnode1 ~]# su - dolphinscheduler
[dolphinscheduler@dsnode1 ~]$ tar -xvzf apache-seatunnel-incubating-2.3.1-bin.tar.gz -C /opt/soft/
[dolphinscheduler@dsnode1 ~]$ mv /opt/soft/apache-seatunnel-incubating-2.3.1/ /opt/soft/seatunnel
[dolphinscheduler@dsnode1 ~]$ scp -r /opt/soft/seatunnel/ dolphinscheduler@dsnode2:/opt/soft/
[dolphinscheduler@dsnode1 ~]$ scp -r /opt/soft/seatunnel/ dolphinscheduler@dsnode3:/opt/soft/
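
The same copy step is used for both DataX and SeaTunnel and has to be repeated for every other Worker node, so it can be wrapped in a small loop. This is a minimal sketch. The function name and the SCP_CMD override are hypothetical, and the target directory /opt/soft/ is assumed to exist on each node and to be writable by the dolphinscheduler user.

```shell
#!/bin/sh
# Sketch: copy a tool directory to /opt/soft/ on each listed Worker node.
# distribute_dir SRC_DIR HOST...
# SCP_CMD may be overridden (e.g. for a dry run); it defaults to scp.
distribute_dir() {
    src=$1
    shift
    for host in "$@"; do
        # Stop at the first failure so a partial rollout is noticed.
        ${SCP_CMD:-scp} -r "$src" "dolphinscheduler@$host:/opt/soft/" || return 1
    done
}

# Usage, as the dolphinscheduler user on dsnode1:
#   distribute_dir /opt/soft/seatunnel/ dsnode2 dsnode3
#   distribute_dir /opt/soft/datax/ dsnode2 dsnode3
```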

Data Sources

Oracle

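Go to the JDBC driver download page on the Oracle website and download the driver matching your Oracle and JDK versions. For Oracle 11g, the official site no longer provides the driver, so download it from the Maven repository instead. The file downloaded here is ojdbc6-11.2.0.4.jar.

Place the driver in the api-server/libs and worker-server/libs directories on the corresponding nodes.

Then restart DolphinScheduler: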

[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/stop-all.sh 
[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/start-all.sh 

You can then log in to DolphinScheduler and create a data source for the Oracle database.

PostgreSQL

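DolphinScheduler ships with the PostgreSQL driver by default, which can be used for both PostgreSQL and Greenplum. If the driver is missing, go to the JDBC driver download page on the PostgreSQL website and download the driver matching your JDK version.

Place the driver in the api-server/libs and worker-server/libs directories on the corresponding nodes.

Then restart DolphinScheduler: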

[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/stop-all.sh 
[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/start-all.sh 

You can then log in to DolphinScheduler and create a data source for the PostgreSQL database.

Contributors: stonebox, stone