DolphinScheduler
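Note:
This document applies to DolphinScheduler 3.1.6.
Introduction
DolphinScheduler is a distributed, easily extensible open-source workflow scheduling system. It supports multi-tenancy, high availability, flexible scheduled tasks, and data processing, and can be used for offline computing, real-time processing, machine learning, and other scenarios in the big data ecosystem. For details, see the official documentation.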
Requirements
Operating System
The operating system requirements are as follows:
Operating system | Version |
---|---|
Red Hat Enterprise Linux | 7.0 or later |
CentOS | 7.0 or later |
Oracle Enterprise Linux | 7.0 or later |
Ubuntu LTS | 16.04 or later |
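Server
The hardware requirements are as follows:
CPU | Memory | Disk type | Network | Instances |
---|---|---|---|---|
4 cores+ | 8 GB+ | SAS | Gigabit NIC | 1+ |
Note:
- The configuration above is the minimum for deploying DolphinScheduler; a higher configuration is strongly recommended for production.
- A disk size of 50 GB or more is recommended, with the system disk and data disk kept separate.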
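Network
The network port requirements are as follows:
Component | Default port | Description |
---|---|---|
MasterServer | 5678 | Not a communication port; it only needs to be free of conflicts on the local host |
WorkerServer | 1234 | Not a communication port; it only needs to be free of conflicts on the local host |
ApiApplicationServer | 12345 | Backend communication port |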
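Before installing, it can help to confirm that none of the default ports in the table is already taken. The following is a minimal sketch, not part of the original procedure: the function name `ports_in_use` is hypothetical, and it assumes input in the column layout printed by `ss -ltn` (header line first, local address and port in the fourth column).

```shell
# Print each DolphinScheduler default port (5678, 1234, 12345) that already
# appears as a listening port in `ss -ltn` style output read from stdin.
ports_in_use() {
    # Column 4 is "Local Address:Port"; keep only the part after the last colon.
    listening=$(awk 'NR > 1 { n = split($4, a, ":"); print a[n] }')
    for port in 5678 1234 12345; do
        # -x matches whole lines, so 1234 does not match 12345.
        if printf '%s\n' "$listening" | grep -qx "$port"; then
            echo "$port"
        fi
    done
}
```

On a cluster host it could be used as `ss -ltn | ports_in_use`; empty output means none of the three ports is in use.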
Deployment
DolphinScheduler can be deployed in the following ways:
- Standalone
- Pseudo-Cluster
- Cluster
- Kubernetes
Production environments generally use Cluster or Kubernetes deployment.
Cluster Deployment
Environment
The cluster is deployed on 3 hosts:
Hostname | IP | Operating system | CPU | Memory | Roles |
---|---|---|---|---|---|
dsnode1 | 192.168.44.140 | RHEL 7.9 | 4 cores | 8 GB | Master, Worker |
dsnode2 | 192.168.44.141 | RHEL 7.9 | 4 cores | 8 GB | Master, Worker |
dsnode3 | 192.168.44.142 | RHEL 7.9 | 4 cores | 8 GB | Worker, Alert Server, Api Server |
[root@dsnode1 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.9 (Maipo)
[root@dsnode1 ~]# grep 'processor' /proc/cpuinfo | uniq | wc -l
4
[root@dsnode1 ~]# cat /proc/meminfo | grep MemTotal
MemTotal: 7914804 kB
Unless otherwise noted, the following operations must be performed on all hosts in the cluster.
Disable SELinux
Edit /etc/selinux/config and change SELINUX=enforcing to SELINUX=disabled.
[root@dsnode1 ~]# sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
After the change, reboot the host.
[root@dsnode1 ~]# init 6
After the reboot, confirm that the SELinux status is disabled.
[root@dsnode1 ~]# sestatus
SELinux status: disabled
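The sed command used in this step only rewrites the value `enforcing`, so a host currently set to `permissive` would be left unchanged. A minimal sketch of a variant that sets the mode regardless of its current value follows; the function name `set_selinux_disabled` is hypothetical, and the file path is a parameter so it can be tried on a copy first.

```shell
# Set SELINUX=disabled in the given config file, whatever the current mode is.
# Only lines that start with "SELINUX=" change; SELINUXTYPE= and comment
# lines are left alone.
set_selinux_disabled() {
    config_file=$1
    tmp_file=$(mktemp) || return 1
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$config_file" > "$tmp_file" &&
        cat "$tmp_file" > "$config_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

For example, `set_selinux_disabled /etc/selinux/config` as root, followed by the reboot described above.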
Disable the Firewall
Stop the firewall and disable it at boot.
[root@dsnode1 ~]# systemctl stop firewalld.service
[root@dsnode1 ~]# systemctl disable firewalld.service
[root@dsnode1 ~]# systemctl status firewalld.service
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:firewalld(1)
Configure Local Name Resolution
On every host in the cluster, add the hostnames and IP addresses of all cluster machines to /etc/hosts.
[root@dsnode1 ~]# echo "192.168.44.140 dsnode1" >> /etc/hosts
[root@dsnode1 ~]# echo "192.168.44.141 dsnode2" >> /etc/hosts
[root@dsnode1 ~]# echo "192.168.44.142 dsnode3" >> /etc/hosts
[root@dsnode1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.44.140 dsnode1
192.168.44.141 dsnode2
192.168.44.142 dsnode3
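Because `echo ... >> /etc/hosts` appends unconditionally, running the step twice leaves duplicate entries. The sketch below appends an entry only when the hostname is not yet present; the function name `add_host_entry` is hypothetical and the hosts file is passed as a parameter.

```shell
# Append "IP NAME" to a hosts file only if NAME is not already listed
# as a hostname on a non-comment line.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # Fields 2..NF of each non-comment line are hostnames and aliases.
    if ! awk -v n="$name" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$hosts_file"; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}
```

For example, `add_host_entry /etc/hosts 192.168.44.140 dsnode1` can be repeated safely on every host.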
Configure Time Synchronization
The clocks of all cluster hosts must be kept in sync. RHEL 6 uses ntpd for time synchronization, and RHEL 7 uses chronyd.
[root@dsnode1 ~]# vi /etc/chrony.conf
server time.stonecoding.net iburst
[root@dsnode1 ~]# systemctl restart chronyd.service
[root@dsnode1 ~]# chronyc makestep
200 OK
[root@dsnode1 ~]# chronyc sourcestats
210 Number of sources = 1
Name/IP Address NP NR Span Frequency Freq Skew Offset Std Dev
==============================================================================
TIME.stonecoding.net 6 3 325 +9.811 25.043 +498us 871us
Install Dependencies
[root@dsnode1 ~]# yum install psmisc
Install JDK
Install the JDK on all hosts in the cluster and configure the environment variables. See: Install JDK.
Install ZooKeeper
Install and configure a ZooKeeper cluster. See: ZooKeeper.
Note:
The ZooKeeper version must match the version of the corresponding jar in the DolphinScheduler installation directory. DolphinScheduler 3.1.6 ships zookeeper-3.8.0.jar, so ZooKeeper 3.8.0 must be installed. If the versions do not match, the following error occurs:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/cli/DefaultParser
at org.apache.zookeeper.cli.DeleteAllCommand.parse(DeleteAllCommand.java:52)
at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:438)
at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:367)
at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:350)
at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:293)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.cli.DefaultParser
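To read the required ZooKeeper version off the bundled jar, the version can be taken from the file name. This is a small sketch with a hypothetical helper name, `zk_version_from_jar`; it assumes the jar follows the `zookeeper-<version>.jar` naming shown in the note.

```shell
# Print the version embedded in a "zookeeper-<version>.jar" file name.
zk_version_from_jar() {
    base=${1##*/}               # drop any leading directory
    version=${base#zookeeper-}  # drop the "zookeeper-" prefix
    echo "${version%.jar}"      # drop the ".jar" suffix
}
```

A possible use, assuming the jar sits in a module's libs directory of the extracted package (the exact location is an assumption): `zk_version_from_jar apache-dolphinscheduler-3.1.6-bin/api-server/libs/zookeeper-3.8.0.jar`.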
Install MySQL
DolphinScheduler requires a database and supports PostgreSQL (8.2.15+) or MySQL (5.7+). MySQL is used here; for installation, see: Installing MySQL on Linux.
Initialize the database:
[(none)]> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Query OK, 1 row affected, 2 warnings (0.01 sec)
[(none)]> CREATE USER 'dolphinscheduler'@'%' IDENTIFIED BY 'Abcd@1234';
Query OK, 0 rows affected (0.03 sec)
[(none)]> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'%';
Query OK, 0 rows affected (0.01 sec)
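Note:
The user password must not contain the special character $; otherwise, initializing the database with the script later fails with the following error:
Caused by: java.sql.SQLException: Access denied for user 'dolphinscheduler'@'192.168.44.140' (using password: YES)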
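A plausible cause of this restriction (an assumption, not confirmed by the source) is that the password is later written unquoted into a shell environment file, where the shell expands anything after `$` as a variable. The hypothetical helper below, `password_is_script_safe`, is a minimal sketch that rejects such passwords before the database user is created.

```shell
# Succeed if the password contains no "$" character, fail otherwise.
password_is_script_safe() {
    case "$1" in
        *'$'*) return 1 ;;
        *)     return 0 ;;
    esac
}
```

For example, `password_is_script_safe 'Abcd@1234' && echo ok` prints `ok`.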
Configure the User
On each node, create a dedicated operating system user, usually named dolphinscheduler, to manage DolphinScheduler.
[root@dsnode1 ~]# groupadd -g 1100 dolphinscheduler
[root@dsnode1 ~]# useradd -g dolphinscheduler -u 1100 dolphinscheduler
[root@dsnode1 ~]# echo "123456" | passwd --stdin dolphinscheduler
[root@dsnode1 ~]# id dolphinscheduler
uid=1100(dolphinscheduler) gid=1100(dolphinscheduler) groups=1100(dolphinscheduler)
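After creating the user, configure SSH mutual trust among all nodes so that they can reach each other over SSH without a password. Oracle's sshUserSetup.sh script is used here; it only needs to be run on one node.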
[root@dsnode1 ~]# ./sshUserSetup.sh -user dolphinscheduler -hosts "dsnode1 dsnode2 dsnode3" -advanced -noPromptPassphrase
If the error Bad owner or permissions on /home/gpadmin/.ssh/config appears, the cause is that RHEL 7.9 changed the permission requirements for the config file. See Doc ID 2923516.1, modify the script, and run it again:
[root@dsnode1 ~]# grep -n "chmod 644" sshUserSetup.sh | grep config
450:chmod 644 $HOME/.ssh/config
496: $SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c \" mkdir -p .ssh ; chmod og-w . .ssh; touch .ssh/authorized_keys .ssh/known_hosts; chmod 644 .ssh/authorized_keys .ssh/known_hosts; cp .ssh/authorized_keys .ssh/authorized_keys.tmp ; cp .ssh/known_hosts .ssh/known_hosts.tmp; echo \\"Host *\\" > .ssh/config.tmp; echo \\"ForwardX11 no\\" >> .ssh/config.tmp; if test -f .sshconfig ; then cp -f .ssh/config .ssh/config.backup; fi ; mv -f .ssh/config.tmp .ssh/config\"" | tee -a $LOGFILE
572:chmod 644 $HOME/.ssh/config
[root@dsnode1 ~]# vi sshUserSetup.sh
[root@dsnode1 ~]# grep -n "chmod 600" sshUserSetup.sh
450:chmod 600 $HOME/.ssh/config
496: $SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c \" mkdir -p .ssh ; chmod og-w . .ssh; touch .ssh/authorized_keys .ssh/known_hosts; chmod 644 .ssh/authorized_keys .ssh/known_hosts; cp .ssh/authorized_keys .ssh/authorized_keys.tmp ; cp .ssh/known_hosts .ssh/known_hosts.tmp; echo \\"Host *\\" > .ssh/config.tmp; echo \\"ForwardX11 no\\" >> .ssh/config.tmp; if test -f .ssh/config ; then cp -f .ssh/config .ssh/config.backup; fi ; mv -f .ssh/config.tmp .ssh/config ; chmod 600 .ssh/config\"" | tee -a $LOGFILE
572:chmod 600 $HOME/.ssh/config
[root@dsnode1 ~]# ./sshUserSetup.sh -user dolphinscheduler -hosts "dsnode1 dsnode2 dsnode3" -advanced -noPromptPassphrase
On all nodes, configure sudo for the dolphinscheduler user:
[root@dsnode1 ~]# visudo
## Allows people in group wheel to run all commands
%wheel ALL=(ALL) ALL
## Same thing without a password
%wheel ALL=(ALL) NOPASSWD: ALL
[root@dsnode1 ~]# usermod -aG wheel dolphinscheduler
On all nodes, switch to the dolphinscheduler user and configure the Java environment variables:
[root@dsnode1 ~]# su - dolphinscheduler
[dolphinscheduler@dsnode1 ~]$ vi .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/.local/bin:$HOME/bin
export PATH
export JAVA_HOME=/usr/local/jdk1.8.0_201
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
[dolphinscheduler@dsnode1 ~]$ source .bash_profile
[dolphinscheduler@dsnode1 ~]$ java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
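Install DolphinScheduler
Download the latest stable release from the official website; the package downloaded here is apache-dolphinscheduler-3.1.6-bin.tar.gz.
On a Master node, extract the package as the dolphinscheduler user: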
[root@dsnode1 ~]# su - dolphinscheduler
[dolphinscheduler@dsnode1 ~]$ tar -xvzf apache-dolphinscheduler-3.1.6-bin.tar.gz
Download the mysql-connector-java-8.0.16.jar driver and copy it to the corresponding directories:
[dolphinscheduler@dsnode1 ~]$ cp mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin/alert-server/libs/
[dolphinscheduler@dsnode1 ~]$ cp mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin/api-server/libs/
[dolphinscheduler@dsnode1 ~]$ cp mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin/master-server/libs/
[dolphinscheduler@dsnode1 ~]$ cp mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin/worker-server/libs/
[dolphinscheduler@dsnode1 ~]$ cp mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin/tools/libs/
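The five copy commands above can also be written as a loop over the module directories. This is a minimal sketch; the function name `copy_driver` is hypothetical, and the module list is taken from the commands above.

```shell
# Copy a JDBC driver jar into the libs directory of each DolphinScheduler
# module. Stops at the first failed copy.
copy_driver() {
    jar=$1
    ds_home=$2
    for module in alert-server api-server master-server worker-server tools; do
        cp "$jar" "$ds_home/$module/libs/" || return 1
    done
}
```

For example: `copy_driver mysql-connector-java-8.0.16.jar apache-dolphinscheduler-3.1.6-bin`.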
Configure the database connection:
[dolphinscheduler@dsnode1 ~]$ vi apache-dolphinscheduler-3.1.6-bin/bin/env/dolphinscheduler_env.sh
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# JAVA_HOME, will use it to start DolphinScheduler server
export JAVA_HOME=${JAVA_HOME:-/usr/local/jdk1.8.0_201}
# Database related configuration, set database type, username and password
export DATABASE="mysql"
export SPRING_PROFILES_ACTIVE=${DATABASE}
export SPRING_DATASOURCE_URL="jdbc:mysql://172.30.60.14:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&useSSL=false"
export SPRING_DATASOURCE_USERNAME=dolphinscheduler
export SPRING_DATASOURCE_PASSWORD=Abcd@1234
# DolphinScheduler server related configuration
export SPRING_CACHE_TYPE=${SPRING_CACHE_TYPE:-none}
export SPRING_JACKSON_TIME_ZONE=${SPRING_JACKSON_TIME_ZONE:-UTC}
export MASTER_FETCH_COMMAND_NUM=${MASTER_FETCH_COMMAND_NUM:-10}
# Registry center configuration, determines the type and link of the registry center
export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}
export REGISTRY_ZOOKEEPER_CONNECT_STRING=${REGISTRY_ZOOKEEPER_CONNECT_STRING:-zknode1:2181,zknode2:2181,zknode3:2181}
# Tasks related configurations, need to change the configuration if you use the related tasks.
export HADOOP_HOME=${HADOOP_HOME:-/opt/soft/hadoop}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/opt/soft/hadoop/etc/hadoop}
export SPARK_HOME1=${SPARK_HOME1:-/opt/soft/spark1}
export SPARK_HOME2=${SPARK_HOME2:-/opt/soft/spark2}
export PYTHON_HOME=${PYTHON_HOME:-/opt/soft/python}
export HIVE_HOME=${HIVE_HOME:-/opt/soft/hive}
export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export DATAX_HOME=${DATAX_HOME:-/opt/soft/datax}
export SEATUNNEL_HOME=${SEATUNNEL_HOME:-/opt/soft/seatunnel}
export CHUNJUN_HOME=${CHUNJUN_HOME:-/opt/soft/chunjun}
export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME/bin:$SEATUNNEL_HOME/bin:$CHUNJUN_HOME/bin:$PATH
Run the script to initialize the database:
[dolphinscheduler@dsnode1 ~]$ bash apache-dolphinscheduler-3.1.6-bin/tools/bin/upgrade-schema.sh
Edit the installation configuration file:
[dolphinscheduler@dsnode1 ~]$ vi apache-dolphinscheduler-3.1.6-bin/bin/env/install_env.sh
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# ---------------------------------------------------------
# INSTALL MACHINE
# ---------------------------------------------------------
# A comma separated list of machine hostname or IP would be installed DolphinScheduler,
# including master, worker, api, alert. If you want to deploy in pseudo-distributed
# mode, just write a pseudo-distributed hostname
# Example for hostnames: ips="ds1,ds2,ds3,ds4,ds5", Example for IPs: ips="192.168.8.1,192.168.8.2,192.168.8.3,192.168.8.4,192.168.8.5"
ips="dsnode1,dsnode2,dsnode3"
# Port of SSH protocol, default value is 22. For now we only support same port in all `ips` machine
# modify it if you use different ssh port
sshPort="22"
# A comma separated list of machine hostname or IP would be installed Master server, it
# must be a subset of configuration `ips`.
# Example for hostnames: masters="ds1,ds2", Example for IPs: masters="192.168.8.1,192.168.8.2"
masters="dsnode1,dsnode2"
# A comma separated list of machine <hostname>:<workerGroup> or <IP>:<workerGroup>.All hostname or IP must be a
# subset of configuration `ips`, And workerGroup have default value as `default`, but we recommend you declare behind the hosts
# Example for hostnames: workers="ds1:default,ds2:default,ds3:default", Example for IPs: workers="192.168.8.1:default,192.168.8.2:default,192.168.8.3:default"
workers="dsnode1:default,dsnode2:default,dsnode3:default"
# A comma separated list of machine hostname or IP would be installed Alert server, it
# must be a subset of configuration `ips`.
# Example for hostname: alertServer="ds3", Example for IP: alertServer="192.168.8.3"
alertServer="dsnode3"
# A comma separated list of machine hostname or IP would be installed API server, it
# must be a subset of configuration `ips`.
# Example for hostname: apiServers="ds1", Example for IP: apiServers="192.168.8.1"
apiServers="dsnode3"
# The directory to install DolphinScheduler for all machine we config above. It will automatically be created by `install.sh` script if not exists.
# Do not set this configuration same as the current path (pwd). Do not add quotes to it if you using related path.
installPath="/home/dolphinscheduler/dolphinscheduler"
# The user to deploy DolphinScheduler for all machine we config above. For now user must create by yourself before running `install.sh`
# script. The user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled than the root directory needs
# to be created by this user
deployUser="dolphinscheduler"
# The root of zookeeper, for now DolphinScheduler default registry server is zookeeper.
zkRoot=${zkRoot:-"/dolphinscheduler"}
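The comments in install_env.sh require that `masters`, `workers`, `alertServer`, and `apiServers` each be a subset of `ips`. The sketch below checks that rule before running the installer; the function name `hosts_subset_of` is hypothetical and is not part of DolphinScheduler.

```shell
# Succeed only if every host in the comma-separated list $1 also appears in
# the comma-separated list $2. A ":group" suffix, as used in `workers`,
# is ignored.
hosts_subset_of() {
    for entry in $(printf '%s' "$1" | tr ',' ' '); do
        host=${entry%%:*}
        case ",$2," in
            *",$host,"*) ;;
            *) echo "not in ips: $host" >&2; return 1 ;;
        esac
    done
}
```

After sourcing the file, a check could look like `hosts_subset_of "$workers" "$ips" && hosts_subset_of "$masters" "$ips"`.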
Run the installation script, which copies the program files to /home/dolphinscheduler/dolphinscheduler on all nodes:
[dolphinscheduler@dsnode1 ~]$ bash apache-dolphinscheduler-3.1.6-bin/bin/install.sh
Stop all services:
[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/stop-all.sh
Start all services:
[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/start-all.sh
Check the service status:
[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/status-all.sh
====================== dolphinscheduler server config =============================
1.dolphinscheduler server node config hosts:[ dsnode1,dsnode2,dsnode3 ]
2.master server node config hosts:[ dsnode1,dsnode2 ]
3.worker server node config hosts:[ dsnode1:default,dsnode2:default,dsnode3:default ]
4.alert server node config hosts:[ dsnode3 ]
5.api server node config hosts:[ dsnode3 ]
====================== dolphinscheduler server status =============================
node server state
dsnode1 Begin status master-server......
master-server [ RUNNING ]
End status master-server.
dsnode2 Begin status master-server......
master-server [ RUNNING ]
End status master-server.
dsnode1 Begin status worker-server......
worker-server [ RUNNING ]
End status worker-server.
dsnode2 Begin status worker-server......
worker-server [ RUNNING ]
End status worker-server.
dsnode3 Begin status worker-server......
worker-server [ RUNNING ]
End status worker-server.
dsnode3 Begin status alert-server......
alert-server [ RUNNING ]
End status alert-server.
dsnode3 Begin status api-server......
api-server [ RUNNING ]
End status api-server.
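Log In to DolphinScheduler
After a successful start, open http://192.168.44.142:12345/dolphinscheduler/ui (the host running the Api Server) in a browser to reach the login page. The default username is admin and the password is dolphinscheduler123.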
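Features
Tasks
DataX
Download the DataX package from the official website to the home directory of the dolphinscheduler user on a Worker node. The configuration file /home/dolphinscheduler/dolphinscheduler/bin/env/dolphinscheduler_env.sh specifies /opt/soft/datax as the default DataX directory, so extract the package into that directory and then copy it to the other Worker nodes. The /opt/soft directory must already exist and be writable by dolphinscheduler on every Worker node before copying.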
[root@dsnode1 ~]# mkdir /opt/soft
[root@dsnode1 ~]# chown dolphinscheduler:dolphinscheduler /opt/soft
[root@dsnode1 ~]# su - dolphinscheduler
[dolphinscheduler@dsnode1 ~]$ tar -xvzf datax.tar.gz -C /opt/soft/
[dolphinscheduler@dsnode1 ~]$ scp -r /opt/soft/datax/ dolphinscheduler@dsnode2:/opt/soft/
[dolphinscheduler@dsnode1 ~]$ scp -r /opt/soft/datax/ dolphinscheduler@dsnode3:/opt/soft/
Create a symbolic link for Python on each node:
[root@bdatnode1 ~]# python -V
Python 2.7.5
[dolphinscheduler@bdatnode1 ~]$ mkdir -p /opt/soft/python/bin/
[dolphinscheduler@bdatnode1 ~]$ ln -s /bin/python /opt/soft/python/bin/python2.7
[dolphinscheduler@bdatnode1 ~]$ ll /opt/soft/python/bin/
total 0
lrwxrwxrwx 1 dolphinscheduler dolphinscheduler 11 May 23 15:04 python2.7 -> /bin/python
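Alibaba's official DataX has no dedicated plugin for Greenplum, so only the PostgreSQL plugin can be used, and its performance is poor. Alternatively, use a DataX build that includes the gpdbwriter plugin, such as HashDataInc/DataX or BoomLee/DataX.
Note:
When configuring a connection to Oracle by service name, the jdbcUrl takes the following form (a slash separates the port and the service name):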
"jdbc:oracle:thin:@192.168.44.135:1521/oratest"
Example: read data from Oracle and write it to Greenplum
{
    "job": {
        "setting": {
            "speed": {
                "channel": 3
            },
            "errorLimit": {
                "record": 2,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "oraclereader",
                    "parameter": {
                        "username": "orcl",
                        "password": "123456",
                        "column": ["*"],
                        "splitPk": "id",
                        "connection": [
                            {
                                "table": [
                                    "orcl.testtab1"
                                ],
                                "jdbcUrl": [
                                    "jdbc:oracle:thin:@192.168.44.135:1521/oratest"
                                ]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "postgresqlwriter",
                    "parameter": {
                        "username": "tester",
                        "password": "123456",
                        "column": ["*"],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:postgresql://192.168.44.160:5432/pgtest",
                                "table": [
                                    "test.testtab1"
                                ]
                            }
                        ]
                    }
                }
            }
        ]
    }
}
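The Oracle jdbcUrl in this job uses the service-name form, with a slash between the port and the name. With the thin driver, a SID is instead separated from the port by a colon. The sketch below builds either form; the function name `oracle_jdbc_url` is hypothetical.

```shell
# Build an Oracle thin-driver JDBC URL.
# KIND is "service" (host:port/name) or "sid" (host:port:name).
oracle_jdbc_url() {
    kind=$1
    host=$2
    port=$3
    name=$4
    case "$kind" in
        service) echo "jdbc:oracle:thin:@${host}:${port}/${name}" ;;
        sid)     echo "jdbc:oracle:thin:@${host}:${port}:${name}" ;;
        *)       echo "unknown kind: $kind" >&2; return 1 ;;
    esac
}
```

For example, `oracle_jdbc_url service 192.168.44.135 1521 oratest` prints the URL used in the reader above.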
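SeaTunnel
Download the SeaTunnel package from the official website to the home directory of the dolphinscheduler user on a Worker node. The configuration file /home/dolphinscheduler/dolphinscheduler/bin/env/dolphinscheduler_env.sh specifies /opt/soft/seatunnel as the default SeaTunnel directory, so extract the package into that directory and then copy it to the other Worker nodes.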
[root@dsnode1 ~]# su - dolphinscheduler
[dolphinscheduler@dsnode1 ~]$ tar -xvzf apache-seatunnel-incubating-2.3.1-bin.tar.gz -C /opt/soft/
[dolphinscheduler@dsnode1 ~]$ mv /opt/soft/apache-seatunnel-incubating-2.3.1/ /opt/soft/seatunnel
[dolphinscheduler@dsnode1 ~]$ scp -r /opt/soft/seatunnel/ dolphinscheduler@dsnode2:/opt/soft/
[dolphinscheduler@dsnode1 ~]$ scp -r /opt/soft/seatunnel/ dolphinscheduler@dsnode3:/opt/soft/
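Data Sources
Oracle
Go to the JDBC driver download page on the Oracle website and download the driver that matches your Oracle and JDK versions. For Oracle 11g, the website no longer provides a driver, but it can be downloaded from the Maven repository. The driver downloaded here is ojdbc6-11.2.0.4.jar.
Place the driver in the api-server/libs and worker-server/libs directories on the corresponding nodes.
Then restart DolphinScheduler: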
[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/stop-all.sh
[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/start-all.sh
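You can then log in to DolphinScheduler and create a data source for the Oracle database.
PostgreSQL
DolphinScheduler ships with the PostgreSQL driver, which works for both PostgreSQL and Greenplum. If the driver is missing, go to the JDBC driver download page on the PostgreSQL website and download the driver that matches your JDK version.
Place the driver in the api-server/libs and worker-server/libs directories on the corresponding nodes.
Then restart DolphinScheduler: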
[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/stop-all.sh
[dolphinscheduler@dsnode1 ~]$ ./dolphinscheduler/bin/start-all.sh
You can then log in to DolphinScheduler and create a data source for the PostgreSQL database.