Greenplum Administration


Note:

This document covers Greenplum version 6.24.3, which is based on PostgreSQL 9.4.26.

Overview

Greenplum is a data warehouse solution built on the open-source PostgreSQL database, designed for large-scale data storage and analytics. It can handle petabytes of data across a cluster and provides parallel processing, load balancing, and high availability, making it well suited to enterprise big data analytics. See the official documentation for details.

Greenplum Architecture

Platform Requirements

Operating Systems

Greenplum Database 6 supports the following operating system versions:

  • Red Hat Enterprise Linux 64-bit 8.7 or later (As of Greenplum Database version 6.20)
  • Red Hat Enterprise Linux 64-bit 7.3 or later
  • Red Hat Enterprise Linux 64-bit 6.x
  • CentOS 64-bit 7.3 or later
  • CentOS 64-bit 6.x
  • Ubuntu 18.04 LTS
  • Oracle Linux 64-bit 7, using the Red Hat Compatible Kernel (RHCK)

Note:

Enabling resource group-based workload management on RedHat 6.x and CentOS 6.x has known performance issues, so it is not recommended on those systems.

Hardware and Network

The minimum hardware requirements are as follows:

  • Minimum CPU: Any x86_64 compatible CPU
  • Minimum Memory: 16 GB RAM per server
  • Disk Space Requirements: 150 MB per host for the Greenplum installation; approximately 300 MB per segment instance for metadata; cap disk capacity at 70% full to accommodate temporary files and prevent performance degradation
  • Network Requirements: 10 Gigabit Ethernet within the array; NIC bonding is recommended when multiple interfaces are present; Greenplum Database can use either IPv4 or IPv6 protocols
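
A quick way to check usage against the 70% guideline (a minimal check, assuming the data directories live under /data):

$ df -h /data
$ df -h /data | awk 'NR==2 {print "used:", $5}'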

Storage

Greenplum Database supports only the XFS file system.
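
To confirm that the data directories are on XFS (in this deployment /data sits on the root XFS volume, so / is the relevant mount point), the filesystem type can be checked with:

$ df -Th /data
$ xfs_info /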

Installing and Upgrading

How to Get Greenplum

On the GitHub Greenplum repository page, open Releases to reach the Greenplum download page, then choose the operating system and version and download the corresponding installation package.

Greenplum version: 6.24.3

Operating system: Red Hat Enterprise Linux

Operating system version: Red Hat Enterprise Linux 7 (x86, 64-bit)

Download Packages: RPM Bundle

Configuring Your Systems

Three hosts are used to deploy the cluster. The host information is as follows:

Hostname   IP               OS         CPU                   Memory   Roles
gpnode1    192.168.44.160   RHEL 7.9   2 sockets, 16 cores   32 GB    Master, Segment
gpnode2    192.168.44.161   RHEL 7.9   2 sockets, 16 cores   32 GB    Standby Master, Segment
gpnode3    192.168.44.162   RHEL 7.9   2 sockets, 16 cores   32 GB    Segment
[root@gpnode1 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.9 (Maipo)
[root@gpnode1 ~]# grep 'physical id' /proc/cpuinfo | uniq | wc -l
2
[root@gpnode1 ~]# grep 'processor' /proc/cpuinfo | uniq | wc -l
16
[root@gpnode1 ~]# cat /proc/meminfo | grep MemTotal
MemTotal:       32778476 kB

Unless otherwise noted, the following operations must be performed on every host in the cluster.
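
Before passwordless SSH is configured later in this guide, a simple loop can be used to repeat a command on every host from gpnode1 (a sketch; each ssh prompts for the root password):

# for h in gpnode1 gpnode2 gpnode3; do ssh root@$h hostname; done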

Deactivate SELinux

Disable SELinux.

Edit the file /etc/selinux/config and change SELINUX=enforcing to SELINUX=disabled.

[root@gpnode1 ~]# sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config

Reboot the host after making the change.

[root@gpnode1 ~]# init 6

After the reboot, confirm that the SELinux status is disabled.

[root@gpnode1 ~]# sestatus 
SELinux status:                 disabled

Deactivate Firewall

Stop the firewall and disable it from starting at boot.

[root@gpnode1 ~]# systemctl stop firewalld.service
[root@gpnode1 ~]# systemctl disable firewalld.service
[root@gpnode1 ~]# systemctl status firewalld.service 
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
The hosts File

Configure local name resolution by adding the hostname and IP address of every host in the cluster to the /etc/hosts file.

[root@gpnode1 ~]# echo "192.168.44.160   gpnode1" >> /etc/hosts
[root@gpnode1 ~]# echo "192.168.44.161   gpnode2" >> /etc/hosts
[root@gpnode1 ~]# echo "192.168.44.162   gpnode3" >> /etc/hosts
[root@gpnode1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.44.160   gpnode1
192.168.44.161   gpnode2
192.168.44.162   gpnode3
The sysctl.conf File

Use the formulas below to calculate kernel.shmall and kernel.shmmax, which should equal half of physical memory:

[root@gpnode1 ~]# echo "kernel.shmall =" $(expr $(getconf _PHYS_PAGES) / 2) 
kernel.shmall = 4097309
[root@gpnode1 ~]# echo "kernel.shmmax =" $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
kernel.shmmax = 16782577664

Use the formula below to calculate vm.min_free_kbytes; 3% of physical memory is recommended, and the value should not exceed 5% of physical memory:

[root@gpnode1 ~]# awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}' /proc/meminfo
vm.min_free_kbytes = 983354

Add the parameters to be tuned to the /etc/sysctl.conf file and reload it.

[root@gpnode1 ~]# vi /etc/sysctl.conf
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).

kernel.shmall = 4097309
kernel.shmmax = 16782577664
kernel.shmmni = 4096
vm.overcommit_memory = 2
vm.overcommit_ratio = 95

net.ipv4.ip_local_port_range = 10000 65535
kernel.sem = 250 2048000 200 8192
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ipfrag_high_thresh = 41943040
net.ipv4.ipfrag_low_thresh = 31457280
net.ipv4.ipfrag_time = 60
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.min_free_kbytes = 983354
kernel.core_pattern=/tmp/core.%h.%t

# when Mem <= 64GB
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10

# when Mem > 64GB
#vm.dirty_background_ratio = 0
#vm.dirty_ratio = 0
#vm.dirty_background_bytes = 1610612736
#vm.dirty_bytes = 4294967296

[root@gpnode1 ~]# sysctl -p
kernel.shmall = 4097309
kernel.shmmax = 16782577664
kernel.shmmni = 4096
vm.overcommit_memory = 2
vm.overcommit_ratio = 95
net.ipv4.ip_local_port_range = 10000 65535
kernel.sem = 250 2048000 200 8192
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ipfrag_high_thresh = 41943040
net.ipv4.ipfrag_low_thresh = 31457280
net.ipv4.ipfrag_time = 60
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.min_free_kbytes = 983354
kernel.core_pattern = /tmp/core.%h.%t
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10
System Resources Limits

Edit the file /etc/security/limits.conf and add the following parameters to adjust the system resource limits.

[root@gpnode1 ~]# vi /etc/security/limits.conf
* soft nofile 524288
* hard nofile 524288
* soft nproc 131072
* hard nproc 131072
* soft core unlimited

[root@gpnode3 ~]# logout
[root@gpnode1 ~]# ulimit -u
131072
XFS Mount Options

Adjust the XFS partition mount options.

[root@gpnode1 ~]# mount | grep root
/dev/mapper/vg1-root on / type xfs (rw,relatime,attr2,inode64,noquota)
[root@gpnode1 ~]# vi /etc/fstab 
/dev/mapper/vg1-root    /                       xfs     defaults,rw,nodev,noatime,nobarrier,inode64        0 0
[root@gpnode1 ~]# mount -o remount /
[root@gpnode1 ~]# mount | grep root
/dev/mapper/vg1-root on / type xfs (rw,nodev,noatime,attr2,nobarrier,inode64,noquota)
Disk I/O Settings

Set the read-ahead value of the partition to 16384.

[root@gpnode1 ~]# /sbin/blockdev --getra /dev/mapper/vg1-root
8192
[root@gpnode1 ~]# /sbin/blockdev --setra 16384 /dev/mapper/vg1-root
[root@gpnode1 ~]# /sbin/blockdev --getra /dev/mapper/vg1-root
16384
[root@gpnode1 ~]# vi /etc/rc.d/rc.local 
/sbin/blockdev --setra 16384 /dev/mapper/vg1-root
[root@gpnode1 ~]# chmod a+x /etc/rc.d/rc.local

For Greenplum Database, the recommended disk I/O scheduler policy for each storage device type and operating system is:

Storage Device Type                  OS                       Recommended Scheduler Policy
Non-Volatile Memory Express (NVMe)   RHEL 7, RHEL 8, Ubuntu   none
Solid-State Drives (SSD)             RHEL 7                   noop
Solid-State Drives (SSD)             RHEL 8, Ubuntu           none
Other                                RHEL 7                   deadline
Other                                RHEL 8, Ubuntu           mq-deadline

The current scheduler policy is already deadline:

[root@gpnode1 ~]# cat /sys/block/sda/queue/scheduler 
noop [deadline] cfq 

If it is not, set it with the following command:

# grubby --update-kernel=ALL --args="elevator=deadline"
Transparent Huge Pages

Transparent Huge Pages (THP) degrade Greenplum Database performance, so THP must be deactivated.

[root@gpnode1 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled 
[always] madvise never
[root@gpnode1 ~]# grubby --update-kernel=ALL --args="transparent_hugepage=never"
[root@gpnode1 ~]# init 6
[root@gpnode1 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled 
always madvise [never]
IPC Object Removal

If the operating system is RHEL 7.2 or CentOS 7.2, edit the file /etc/systemd/logind.conf to disable RemoveIPC:

[root@gpnode1 ~]# vi /etc/systemd/logind.conf
RemoveIPC=no
[root@gpnode1 ~]# systemctl daemon-reload
[root@gpnode1 ~]# systemctl restart systemd-logind
SSH Connection Threshold

To keep SSH connections from clients to the hosts from exceeding the default thresholds, adjust the MaxStartups and MaxSessions parameters in /etc/ssh/sshd_config.

[root@gpnode1 ~]# egrep 'MaxStartups|MaxSessions' /etc/ssh/sshd_config
#MaxSessions 10
#MaxStartups 10:30:100
[root@gpnode1 ~]# echo "MaxSessions 200" >> /etc/ssh/sshd_config
[root@gpnode1 ~]# echo "MaxStartups 200" >> /etc/ssh/sshd_config
[root@gpnode1 ~]# egrep 'MaxStartups|MaxSessions' /etc/ssh/sshd_config
#MaxSessions 10
#MaxStartups 10:30:100
MaxSessions 200
MaxStartups 200
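
Changes to /etc/ssh/sshd_config only apply to new connections after the SSH daemon re-reads its configuration, so restart (or reload) sshd afterwards:

# systemctl restart sshd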

Synchronizing System Clocks

The clocks of all hosts in the cluster must be kept synchronized. RHEL 6 uses ntpd for time synchronization, while RHEL 7 uses chronyd.

First, on the Master node, configure the internal NTP server as the time source:

[root@gpnode1 ~]# vi /etc/chrony.conf 
server time.stone.com iburst
[root@gpnode1 ~]# systemctl restart chronyd.service
[root@gpnode1 ~]# chronyc makestep
200 OK
[root@gpnode1 ~]# chronyc sourcestats
210 Number of sources = 1
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
TIME.STONE.COM              6   3   325     +9.811     25.043   +498us   871us

Next, on the Standby Master node, configure the Master node and the internal NTP server as time sources:

[root@gpnode2 ~]# vi /etc/chrony.conf 
server 192.168.44.160 iburst prefer
server time.stone.com iburst
[root@gpnode2 ~]# systemctl restart chronyd.service
[root@gpnode2 ~]# chronyc makestep
200 OK
[root@gpnode2 ~]# chronyc sourcestats
210 Number of sources = 2
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
gpnode1                   0   0     0     +0.000   2000.000     +0ns  4000ms
TIME.STONE.COM              6   3   322     +2.687     25.847    +31us   723us

Finally, on the Segment nodes, configure the Master and Standby Master nodes as time sources:

[root@gpnode3 ~]# vi /etc/chrony.conf
server 192.168.44.160 iburst prefer
server 192.168.44.161 iburst
[root@gpnode3 ~]# systemctl restart chronyd.service
[root@gpnode3 ~]# chronyc makestep
200 OK
[root@gpnode3 ~]# chronyc sourcestats
210 Number of sources = 2
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
gpnode1                   0   0     0     +0.000   2000.000     +0ns  4000ms
gpnode2                   0   0     0     +0.000   2000.000     +0ns  4000ms

Creating the Greenplum Administrative User

Create a dedicated operating system user, conventionally named gpadmin, on every node to administer Greenplum Database.

Note:

Greenplum Database cannot be run as the root user.

[root@gpnode1 ~]# groupadd -g 2000 gpadmin
[root@gpnode1 ~]# useradd -r -m -g gpadmin -u 3000 gpadmin
[root@gpnode1 ~]# echo "123456" | passwd --stdin gpadmin
[root@gpnode1 ~]# id gpadmin
uid=3000(gpadmin) gid=2000(gpadmin) groups=2000(gpadmin)

Note:

The uid and gid of the gpadmin user must be identical on all nodes.

After creating the user, configure mutual SSH trust among all nodes so that they can reach one another over SSH without a password. Oracle's sshUserSetup.sh script is used here; it only needs to be run on one of the nodes.

[root@gpnode1 ~]# ./sshUserSetup.sh -user gpadmin -hosts "gpnode1 gpnode2 gpnode3" -advanced -noPromptPassphrase

If the error Bad owner or permissions on /home/gpadmin/.ssh/config appears, it is because RHEL 7.9 tightened the permission requirements for the config file (see Doc ID 2923516.1). Modify the script and run it again:

[root@gpnode1 ~]# grep -n "chmod 644" sshUserSetup.sh | grep config
450:chmod 644 $HOME/.ssh/config
496:     $SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c \"  mkdir -p .ssh ; chmod og-w . .ssh;   touch .ssh/authorized_keys .ssh/known_hosts;  chmod 644 .ssh/authorized_keys  .ssh/known_hosts; cp  .ssh/authorized_keys .ssh/authorized_keys.tmp ;  cp .ssh/known_hosts .ssh/known_hosts.tmp; echo \\"Host *\\" > .ssh/config.tmp; echo \\"ForwardX11 no\\" >> .ssh/config.tmp; if test -f  .sshconfig ; then cp -f .ssh/config .ssh/config.backup; fi ; mv -f .ssh/config.tmp .ssh/config\""  | tee -a $LOGFILE
572:chmod 644 $HOME/.ssh/config

[root@gpnode1 ~]# vi sshUserSetup.sh
[root@gpnode1 ~]# grep -n "chmod 600" sshUserSetup.sh
450:chmod 600 $HOME/.ssh/config
496:     $SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c \"  mkdir -p .ssh ; chmod og-w . .ssh;   touch .ssh/authorized_keys .ssh/known_hosts;  chmod 644 .ssh/authorized_keys  .ssh/known_hosts; cp  .ssh/authorized_keys .ssh/authorized_keys.tmp ;  cp .ssh/known_hosts .ssh/known_hosts.tmp; echo \\"Host *\\" > .ssh/config.tmp; echo \\"ForwardX11 no\\" >> .ssh/config.tmp; if test -f  .ssh/config ; then cp -f .ssh/config .ssh/config.backup; fi ; mv -f .ssh/config.tmp .ssh/config ; chmod 600 .ssh/config\""  | tee -a $LOGFILE
572:chmod 600 $HOME/.ssh/config

[root@gpnode1 ~]# ./sshUserSetup.sh -user gpadmin -hosts "gpnode1 gpnode2 gpnode3" -advanced -noPromptPassphrase
[root@gpnode1 ~]# ./sshUserSetup.sh -user root -hosts "gpnode1 gpnode2 gpnode3" -advanced -noPromptPassphrase
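
As a quick sanity check before continuing, each host can be reached from the gpadmin account; none of these commands should prompt for a password (a gpssh-based verification follows later in this guide):

# su - gpadmin
$ for h in gpnode1 gpnode2 gpnode3; do ssh $h hostname; done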

Configure sudo for the gpadmin user on all nodes:

[root@gpnode1 ~]# visudo
## Allows people in group wheel to run all commands
%wheel  ALL=(ALL)       ALL

## Same thing without a password
%wheel  ALL=(ALL)       NOPASSWD: ALL
[root@gpnode1 ~]# usermod -aG wheel gpadmin

Installing the Greenplum Database Software

Installing Greenplum Database

Downloading the Greenplum Database Server Software

On the GitHub Greenplum repository page, open Releases to reach the Greenplum download page, then choose the operating system and version and download the corresponding installation package.

Greenplum version: 6.24.3

Operating system: Red Hat Enterprise Linux

Operating system version: Red Hat Enterprise Linux 7 (x86, 64-bit)

Download Packages: RPM Bundle

Installing the Greenplum Database Software

Upload the installation package to the gpadmin user's home directory on every host in the cluster, then install it on each host as the root user.

[root@gpnode1 ~]# yum install /home/gpadmin/open-source-greenplum-db-6.24.3-rhel7-x86_64.rpm
[root@gpnode1 ~]# chown -R gpadmin:gpadmin /usr/local/greenplum*

Confirming Your Installation

Run the following commands on the Master node to verify that the Greenplum Database installation and configuration are correct and that SSH trust between the nodes works:

[root@gpnode1 ~]# su - gpadmin
[gpadmin@gpnode1 ~]$ vi hostfile_exkeys
gpnode1
gpnode2
gpnode3
[gpadmin@gpnode1 ~]$ source /usr/local/greenplum-db-6.24.3/greenplum_path.sh
[gpadmin@gpnode1 ~]$ gpssh -f hostfile_exkeys -e 'ls -l /usr/local/greenplum-db-6.24.3'
[gpnode1] ls -l /usr/local/greenplum-db-6.24.3
[gpnode1] total 172
[gpnode1] drwxr-xr-x 7 gpadmin gpadmin   4096 May 15 22:12 bin
[gpnode1] -rw-r--r-- 1 gpadmin gpadmin   1547 May  5 05:13 COPYRIGHT
[gpnode1] drwxr-xr-x 3 gpadmin gpadmin     22 May 15 22:12 docs
[gpnode1] drwxr-xr-x 2 gpadmin gpadmin     27 May 15 22:12 etc
[gpnode1] drwxr-xr-x 3 gpadmin gpadmin     20 May 15 22:12 ext
[gpnode1] -rw-r--r-- 1 gpadmin gpadmin   1123 May  5 05:13 greenplum_path.sh
[gpnode1] drwxr-xr-x 4 gpadmin gpadmin   4096 May 15 22:12 include
[gpnode1] drwxr-xr-x 5 gpadmin gpadmin   4096 May 15 22:13 lib
[gpnode1] drwxr-xr-x 2 gpadmin gpadmin     21 May 15 22:13 libexec
[gpnode1] -rw-r--r-- 1 gpadmin gpadmin  11358 May  5 05:13 LICENSE
[gpnode1] -rw-r--r-- 1 gpadmin gpadmin    480 May  5 05:13 NOTICE
[gpnode1] -rw-r--r-- 1 gpadmin gpadmin 133210 May  5 05:13 open_source_license_greenplum_database.txt
[gpnode1] drwxr-xr-x 2 gpadmin gpadmin   4096 May 15 22:13 sbin
[gpnode1] drwxr-xr-x 4 gpadmin gpadmin     41 May 15 22:13 share
[gpnode2] ls -l /usr/local/greenplum-db-6.24.3
[gpnode2] total 172
[gpnode2] drwxr-xr-x 7 gpadmin gpadmin   4096 May 15 22:16 bin
[gpnode2] -rw-r--r-- 1 gpadmin gpadmin   1547 May  5 05:13 COPYRIGHT
[gpnode2] drwxr-xr-x 3 gpadmin gpadmin     22 May 15 22:16 docs
[gpnode2] drwxr-xr-x 2 gpadmin gpadmin     27 May 15 22:16 etc
[gpnode2] drwxr-xr-x 3 gpadmin gpadmin     20 May 15 22:16 ext
[gpnode2] -rw-r--r-- 1 gpadmin gpadmin   1123 May  5 05:13 greenplum_path.sh
[gpnode2] drwxr-xr-x 4 gpadmin gpadmin   4096 May 15 22:16 include
[gpnode2] drwxr-xr-x 5 gpadmin gpadmin   4096 May 15 22:16 lib
[gpnode2] drwxr-xr-x 2 gpadmin gpadmin     21 May 15 22:16 libexec
[gpnode2] -rw-r--r-- 1 gpadmin gpadmin  11358 May  5 05:13 LICENSE
[gpnode2] -rw-r--r-- 1 gpadmin gpadmin    480 May  5 05:13 NOTICE
[gpnode2] -rw-r--r-- 1 gpadmin gpadmin 133210 May  5 05:13 open_source_license_greenplum_database.txt
[gpnode2] drwxr-xr-x 2 gpadmin gpadmin   4096 May 15 22:16 sbin
[gpnode2] drwxr-xr-x 4 gpadmin gpadmin     41 May 15 22:16 share
[gpnode3] ls -l /usr/local/greenplum-db-6.24.3
[gpnode3] total 172
[gpnode3] drwxr-xr-x 7 gpadmin gpadmin   4096 May 15 22:16 bin
[gpnode3] -rw-r--r-- 1 gpadmin gpadmin   1547 May  5 05:13 COPYRIGHT
[gpnode3] drwxr-xr-x 3 gpadmin gpadmin     22 May 15 22:16 docs
[gpnode3] drwxr-xr-x 2 gpadmin gpadmin     27 May 15 22:16 etc
[gpnode3] drwxr-xr-x 3 gpadmin gpadmin     20 May 15 22:16 ext
[gpnode3] -rw-r--r-- 1 gpadmin gpadmin   1123 May  5 05:13 greenplum_path.sh
[gpnode3] drwxr-xr-x 4 gpadmin gpadmin   4096 May 15 22:16 include
[gpnode3] drwxr-xr-x 5 gpadmin gpadmin   4096 May 15 22:16 lib
[gpnode3] drwxr-xr-x 2 gpadmin gpadmin     21 May 15 22:16 libexec
[gpnode3] -rw-r--r-- 1 gpadmin gpadmin  11358 May  5 05:13 LICENSE
[gpnode3] -rw-r--r-- 1 gpadmin gpadmin    480 May  5 05:13 NOTICE
[gpnode3] -rw-r--r-- 1 gpadmin gpadmin 133210 May  5 05:13 open_source_license_greenplum_database.txt
[gpnode3] drwxr-xr-x 2 gpadmin gpadmin   4096 May 15 22:16 sbin
[gpnode3] drwxr-xr-x 4 gpadmin gpadmin     41 May 15 22:16 share

The installation directory listing is returned from all nodes without any password prompt.

Creating the Data Storage Areas

Creating Data Storage Areas on the Master and Standby Master Hosts

Create the data directory on the Master node; it stores the system catalog tables and system metadata.

[root@gpnode1 ~]# mkdir -p /data/master
[root@gpnode1 ~]# chown -R gpadmin:gpadmin /data

Create the data directory on the Standby Master node; it stores the system catalog tables and system metadata.

[root@gpnode2 ~]# mkdir -p /data/master
[root@gpnode2 ~]# chown -R gpadmin:gpadmin /data

Creating Data Storage Areas on Segment Hosts

Create the data directories on the Segment nodes; they store the user data.

Create the file hostfile_gpssh_segonly containing the hostnames of all Segment nodes:

[root@gpnode1 ~]# vi hostfile_gpssh_segonly
gpnode1
gpnode2
gpnode3

Use the gpssh command to create the primary and mirror data directories:

[root@gpnode1 ~]# source /usr/local/greenplum-db/greenplum_path.sh
[root@gpnode1 ~]# gpssh -f hostfile_gpssh_segonly -e 'mkdir -p /data/primary'
[gpnode3] mkdir -p /data/primary
[gpnode2] mkdir -p /data/primary
[gpnode1] mkdir -p /data/primary
[root@gpnode1 ~]# gpssh -f hostfile_gpssh_segonly -e 'mkdir -p /data/mirror'
[gpnode2] mkdir -p /data/mirror
[gpnode1] mkdir -p /data/mirror
[gpnode3] mkdir -p /data/mirror
[root@gpnode1 ~]# gpssh -f hostfile_gpssh_segonly -e 'chown -R gpadmin:gpadmin /data/*'
[gpnode3] chown -R gpadmin:gpadmin /data/*
[gpnode2] chown -R gpadmin:gpadmin /data/*
[gpnode1] chown -R gpadmin:gpadmin /data/*

Validating Your Systems

After installation, use the gpcheckperf utility to test network, disk I/O, and memory performance.

Validating Network Performance

Test the network performance:

[root@gpnode1 ~]# su - gpadmin
[gpadmin@gpnode1 ~]$ vi hostfile_gpchecknet
gpnode1
gpnode2
gpnode3
[gpadmin@gpnode1 ~]$ source /usr/local/greenplum-db/greenplum_path.sh
[gpadmin@gpnode1 ~]$ gpcheckperf -f hostfile_gpchecknet -r N -d /tmp > net.out

Validating Disk I/O and Memory Bandwidth

To test disk I/O and memory bandwidth, create the file hostfile_gpcheckperf containing the hostnames of all Segment nodes.

[root@gpnode1 ~]# su - gpadmin
[gpadmin@gpnode1 ~]$ vi hostfile_gpcheckperf
gpnode1
gpnode2
gpnode3
[gpadmin@gpnode1 ~]$ source /usr/local/greenplum-db/greenplum_path.sh
[gpadmin@gpnode1 ~]$ gpcheckperf -f hostfile_gpcheckperf -r ds -D -d /data/primary -d  /data/mirror > disk.out

Initializing a Greenplum Database System

Because Greenplum is a distributed database, initializing a Greenplum Database system means initializing multiple PostgreSQL database instances, called Segment instances. Once the Greenplum Database system has been initialized and started, you can create and manage databases by connecting to the Master, just as with a regular PostgreSQL database.

Initializing Greenplum Database

On the Master node, log in as the gpadmin user and source the environment script greenplum_path.sh:

[root@gpnode1 ~]# su - gpadmin
[gpadmin@gpnode1 ~]$ source /usr/local/greenplum-db/greenplum_path.sh 
Creating the Initialization Host File

Create the initialization host file hostfile_gpinitsystem containing the hostnames of all Segment nodes:

[gpadmin@gpnode1 ~]$ mkdir gpconfigs
[gpadmin@gpnode1 ~]$ vi gpconfigs/hostfile_gpinitsystem
gpnode1
gpnode2
gpnode3
Creating the Greenplum Database Configuration File

Create the Greenplum Database configuration file; it must define one Master instance and at least two Segment instances.

[gpadmin@gpnode1 ~]$ cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config
[gpadmin@gpnode1 ~]$ vi /home/gpadmin/gpconfigs/gpinitsystem_config 
# FILE NAME: gpinitsystem_config

# Configuration file needed by the gpinitsystem

################################################
#### REQUIRED PARAMETERS
################################################

#### Name of this Greenplum system enclosed in quotes.
ARRAY_NAME="Greenplum Data Platform"

#### Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg

#### Base number by which primary segment port numbers 
#### are calculated.
PORT_BASE=6000

#### File system location(s) where primary segment data directories 
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in 
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).
declare -a DATA_DIRECTORY=(/data/primary /data/primary)

#### OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=gpnode1

#### File system location where the master data directory 
#### will be created.
MASTER_DIRECTORY=/data/master

#### Port number for the master instance. 
MASTER_PORT=5432

#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh

#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8

#### Default server-side character set encoding.
ENCODING=UNICODE

################################################
#### OPTIONAL MIRROR PARAMETERS
################################################

#### Base number by which mirror segment port numbers 
#### are calculated.
MIRROR_PORT_BASE=7000

#### File system location(s) where mirror segment data directories 
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the 
#### DATA_DIRECTORY parameter.
declare -a MIRROR_DATA_DIRECTORY=(/data/mirror /data/mirror)


################################################
#### OTHER OPTIONAL PARAMETERS
################################################

#### Create a database of this name after initialization.
#DATABASE_NAME=name_of_database

#### Specify the location of the host address file here instead of
#### with the -h option of gpinitsystem.
#MACHINE_LIST_FILE=/home/gpadmin/gpconfigs/hostfile_gpinitsystem
Running the Initialization Utility

Use the gpinitsystem utility to create the Greenplum Database system. It must be run as the gpadmin user after sourcing the environment script greenplum_path.sh:

[root@gpnode1 ~]# su - gpadmin 
[gpadmin@gpnode1 ~]$ source /usr/local/greenplum-db/greenplum_path.sh
[gpadmin@gpnode1 ~]$ gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem -s gpnode2 --mirror-mode=spread
20230516:22:06:43:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20230516:22:06:43:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Reading Greenplum configuration file gpconfigs/gpinitsystem_config
20230516:22:06:43:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Locale has not been set in gpconfigs/gpinitsystem_config, will set to default value
20230516:22:06:43:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Locale set to en_US.utf8
20230516:22:06:43:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-No DATABASE_NAME set, will exit following template1 updates
20230516:22:06:43:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-MASTER_MAX_CONNECT not set, will set to default value 250
20230516:22:06:44:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Checking configuration parameters, Completed
20230516:22:06:44:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Commencing multi-home checks, please wait...
...
20230516:22:06:45:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Configuring build for standard array
20230516:22:06:45:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Sufficient hosts for spread mirroring request
20230516:22:06:45:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Commencing multi-home checks, Completed
20230516:22:06:45:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Building primary segment instance array, please wait...
......
20230516:22:06:48:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Building spread mirror array type , please wait...
......
20230516:22:06:51:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Checking Master host
20230516:22:06:51:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Checking new segment hosts, please wait...
............
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Checking new segment hosts, Completed
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Greenplum Database Creation Parameters
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:---------------------------------------
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master Configuration
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:---------------------------------------
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master instance name       = Greenplum Data Platform
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master hostname            = gpnode1
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master port                = 5432
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master instance dir        = /data/master/gpseg-1
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master LOCALE              = en_US.utf8
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Greenplum segment prefix   = gpseg
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master Database            = 
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master connections         = 250
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master buffers             = 128000kB
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Segment connections        = 750
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Segment buffers            = 128000kB
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Checkpoint segments        = 8
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Encoding                   = UNICODE
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Postgres param file        = Off
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Initdb to be used          = /usr/local/greenplum-db-6.24.3/bin/initdb
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-GP_LIBRARY_PATH is         = /usr/local/greenplum-db-6.24.3/lib
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-HEAP_CHECKSUM is           = on
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-HBA_HOSTNAMES is           = 0
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Ulimit check               = Passed
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Array host connect type    = Single hostname per node
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master IP address [1]      = ::1
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master IP address [2]      = 192.168.44.160
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master IP address [3]      = 192.168.122.1
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master IP address [4]      = fe80::7c0b:8c2a:9724:267a
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master IP address [5]      = fe80::d292:a536:80b0:98a6
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Master IP address [6]      = fe80::f42d:b12f:f983:8f8c
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Standby Master             = gpnode2
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Number of primary segments = 2
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Standby IP address         = ::1
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Standby IP address         = 192.168.44.161
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Standby IP address         = 192.168.122.1
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Standby IP address         = fe80::7c0b:8c2a:9724:267a
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Standby IP address         = fe80::d292:a536:80b0:98a6
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Standby IP address         = fe80::f42d:b12f:f983:8f8c
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Total Database segments    = 6
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Trusted shell              = ssh
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Number segment hosts       = 3
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Mirror port base           = 7000
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Number of mirror segments  = 2
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Mirroring config           = ON
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Mirroring type             = Spread
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:----------------------------------------
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:----------------------------------------
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode1       6000    gpnode1       /data/primary/gpseg0    2
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode1       6001    gpnode1       /data/primary/gpseg1    3
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode2       6000    gpnode2       /data/primary/gpseg2    4
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode2       6001    gpnode2       /data/primary/gpseg3    5
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode3       6000    gpnode3       /data/primary/gpseg4    6
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode3       6001    gpnode3       /data/primary/gpseg5    7
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:---------------------------------------
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Greenplum Mirror Segment Configuration
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:---------------------------------------
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode2       7000    gpnode2       /data/mirror/gpseg0     8
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode3       7001    gpnode3       /data/mirror/gpseg1     9
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode3       7000    gpnode3       /data/mirror/gpseg2     10
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode1       7001    gpnode1       /data/mirror/gpseg3     11
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode1       7000    gpnode1       /data/mirror/gpseg4     12
20230516:22:07:04:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-gpnode2       7001    gpnode2       /data/mirror/gpseg5     13

Continue with Greenplum creation Yy|Nn (default=N):
> y
20230516:22:07:49:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Building the Master instance database, please wait...
20230516:22:07:53:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Starting the Master in admin mode
20230516:22:07:53:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
20230516:22:07:53:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
......
20230516:22:07:54:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
..............
20230516:22:08:08:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:------------------------------------------------
20230516:22:08:08:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Parallel process exit status
20230516:22:08:08:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:------------------------------------------------
20230516:22:08:08:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Total processes marked as completed           = 6
20230516:22:08:08:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Total processes marked as killed              = 0
20230516:22:08:08:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Total processes marked as failed              = 0
20230516:22:08:08:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:------------------------------------------------
20230516:22:08:08:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Removing back out file
20230516:22:08:08:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-No errors generated from parallel processes
20230516:22:08:08:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Restarting the Greenplum instance in production mode
20230516:22:08:08:018289 gpstop:gpnode1:gpadmin-[INFO]:-Starting gpstop with args: -a -l /home/gpadmin/gpAdminLogs -m -d /data/master/gpseg-1
20230516:22:08:08:018289 gpstop:gpnode1:gpadmin-[INFO]:-Gathering information and validating the environment...
20230516:22:08:08:018289 gpstop:gpnode1:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20230516:22:08:08:018289 gpstop:gpnode1:gpadmin-[INFO]:-Obtaining Segment details from master...
20230516:22:08:08:018289 gpstop:gpnode1:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 6.24.3 build commit:25d3498a400ca5230e81abb94861f23389315213 Open Source'
20230516:22:08:08:018289 gpstop:gpnode1:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='smart'
20230516:22:08:08:018289 gpstop:gpnode1:gpadmin-[INFO]:-Master segment instance directory=/data/master/gpseg-1
20230516:22:08:08:018289 gpstop:gpnode1:gpadmin-[INFO]:-Stopping master segment and waiting for user connections to finish ...
server shutting down
20230516:22:08:09:018289 gpstop:gpnode1:gpadmin-[INFO]:-Attempting forceful termination of any leftover master process
20230516:22:08:09:018289 gpstop:gpnode1:gpadmin-[INFO]:-Terminating processes for segment /data/master/gpseg-1
20230516:22:08:11:018576 gpstart:gpnode1:gpadmin-[INFO]:-Starting gpstart with args: -a -l /home/gpadmin/gpAdminLogs -d /data/master/gpseg-1
20230516:22:08:11:018576 gpstart:gpnode1:gpadmin-[INFO]:-Gathering information and validating the environment...
20230516:22:08:11:018576 gpstart:gpnode1:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 6.24.3 build commit:25d3498a400ca5230e81abb94861f23389315213 Open Source'
20230516:22:08:11:018576 gpstart:gpnode1:gpadmin-[INFO]:-Greenplum Catalog Version: '301908232'
20230516:22:08:11:018576 gpstart:gpnode1:gpadmin-[INFO]:-Starting Master instance in admin mode
20230516:22:08:11:018576 gpstart:gpnode1:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20230516:22:08:11:018576 gpstart:gpnode1:gpadmin-[INFO]:-Obtaining Segment details from master...
20230516:22:08:11:018576 gpstart:gpnode1:gpadmin-[INFO]:-Setting new master era
20230516:22:08:11:018576 gpstart:gpnode1:gpadmin-[INFO]:-Master Started...
20230516:22:08:11:018576 gpstart:gpnode1:gpadmin-[INFO]:-Shutting down master
20230516:22:08:13:018576 gpstart:gpnode1:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-Process results...
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-----------------------------------------------------
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-   Successful segment starts                                            = 6
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-   Failed segment starts                                                = 0
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-----------------------------------------------------
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-Successfully started 6 of 6 segment instances 
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-----------------------------------------------------
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-Starting Master instance gpnode1 directory /data/master/gpseg-1 
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-Command pg_ctl reports Master gpnode1 instance active
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-Connecting to dbname='template1' connect_timeout=15
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-No standby master configured.  skipping...
20230516:22:08:14:018576 gpstart:gpnode1:gpadmin-[INFO]:-Database successfully started
20230516:22:08:14:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Completed restart of Greenplum instance in production mode
20230516:22:08:15:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Commencing parallel build of mirror segment instances
20230516:22:08:15:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
......
20230516:22:08:15:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
....
20230516:22:08:19:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:------------------------------------------------
20230516:22:08:19:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Parallel process exit status
20230516:22:08:19:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:------------------------------------------------
20230516:22:08:19:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Total processes marked as completed           = 6
20230516:22:08:19:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Total processes marked as killed              = 0
20230516:22:08:19:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Total processes marked as failed              = 0
20230516:22:08:19:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:------------------------------------------------
20230516:22:08:19:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Starting initialization of standby master gpnode2
20230516:22:08:19:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
20230516:22:08:19:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Checking for data directory /data/master/gpseg-1 on gpnode2
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:------------------------------------------------------
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Greenplum standby master initialization parameters
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:------------------------------------------------------
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Greenplum master hostname               = gpnode1
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Greenplum master data directory         = /data/master/gpseg-1
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Greenplum master port                   = 5432
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Greenplum standby master hostname       = gpnode2
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Greenplum standby master port           = 5432
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Greenplum standby master data directory = /data/master/gpseg-1
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Greenplum update system catalog         = On
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Syncing Greenplum Database extensions to standby
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-The packages on gpnode2 are consistent.
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Adding standby master to catalog...
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Database catalog updated successfully.
20230516:22:08:20:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Updating pg_hba.conf file...
20230516:22:08:21:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-pg_hba.conf files updated successfully.
20230516:22:08:22:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Starting standby master
20230516:22:08:22:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Checking if standby master is running on host: gpnode2  in directory: /data/master/gpseg-1
20230516:22:08:23:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Cleaning up pg_hba.conf backup files...
20230516:22:08:24:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
20230516:22:08:24:021158 gpinitstandby:gpnode1:gpadmin-[INFO]:-Successfully created standby master on gpnode2
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Successfully completed standby master initialization
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Log file scan check passed
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Greenplum Database instance successfully created
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-------------------------------------------------------
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-To complete the environment configuration, please 
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/data/master/gpseg-1"
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-   to access the Greenplum scripts for this instance:
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-   or, use -d /data/master/gpseg-1 option for the Greenplum scripts
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-   Example gpstate -d /data/master/gpseg-1
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20230516.log
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Standby Master gpnode2 has been configured
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-To activate the Standby Master Segment in the event of Master
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-failure review options for gpactivatestandby
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-------------------------------------------------------
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-The Master /data/master/gpseg-1/pg_hba.conf post gpinitsystem
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-new array must be explicitly added to this file
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-located in the /usr/local/greenplum-db-6.24.3/docs directory
20230516:22:08:24:008943 gpinitsystem:gpnode1:gpadmin-[INFO]:-------------------------------------------------------

The output line Greenplum Database instance successfully created indicates that the system was created successfully.

If the creation fails, find and correct the cause in the log, then run the backout script that was generated automatically during the creation attempt. The script removes the data directories, postgres processes, and log files that were created. For example:

[gpadmin@gpnode1 ~]$ bash /home/gpadmin/gpAdminLogs/backout_gpinitsystem_gpadmin_20230516_220253

After the script finishes, you can run gpinitsystem again to initialize the system.

Setting the Greenplum Database Timezone

Setting the Greenplum Database timezone prevents Greenplum Database from selecting a timezone each time the cluster is restarted, and it sets the timezone for the Master and Segment instances. Use gpconfig to view and set the timezone.

[root@gpnode1 ~]# su - gpadmin 
[gpadmin@gpnode1 ~]$ source /usr/local/greenplum-db/greenplum_path.sh
[gpadmin@gpnode1 ~]$ gpconfig -s TimeZone
Values on all segments are consistent
GUC          : TimeZone
Master  value: Asia/Shanghai
Segment value: Asia/Shanghai

If the reported timezone is not the one you need, set it with the following command and restart Greenplum Database:

[gpadmin@gpnode1 ~]$ gpconfig -c TimeZone -v 'Asia/Shanghai'
[gpadmin@gpnode1 ~]$ gpstop -ra

Setting Greenplum Environment Variables

Configure the environment variables for the gpadmin user on all nodes.

[gpadmin@gpnode1 ~]$ vi .bash_profile 
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin

export PATH

source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=postgres
export LD_PRELOAD=/lib64/libz.so.1

[gpadmin@gpnode1 ~]$ source .bash_profile
[gpadmin@gpnode1 ~]$ scp .bash_profile gpnode2:`pwd`

Test by creating a database and a user.

[gpadmin@gpnode1 ~]$ psql 
psql (9.4.26)
Type "help" for help.

postgres=# create database tester;
CREATE DATABASE
postgres=# create user tester;
NOTICE:  resource queue required -- using default resource queue "pg_default"
CREATE ROLE
postgres=# alter user tester with password '123456';
ALTER ROLE
postgres=# alter database tester owner to tester;
ALTER DATABASE

Security Configuration

Configuring Client Authentication

When a Greenplum Database system is first initialized, it contains one predefined superuser role with the same name as the operating system user that initialized the system, i.e. gpadmin. By default, the system is configured to allow only local database connections as gpadmin.

Allowing Connections to Greenplum Database

Connections and authentication for external clients are configured in the pg_hba.conf file on the Master and Standby Master nodes. Add a line at the end of the file that allows all users from all addresses to access all databases.

[gpadmin@gpnode1 ~]$ vi $MASTER_DATA_DIRECTORY/pg_hba.conf
host     all         all             0.0.0.0/0              md5
[gpadmin@gpnode1 ~]$ gpstop -u

Note:

In production environments, it is recommended to allow access only from specific, authorized client IP address ranges.
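
For example, to allow only the cluster subnet used in this guide rather than all addresses (adjust the CIDR to your authorized client network), the entry could instead read:

host     all         all             192.168.44.0/24        md5

Reload the configuration with gpstop -u after editing, as above.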

Managing Data

Defining Database Objects

Creating and Managing Tablespaces

Tablespaces let the database administrator create multiple directories on each host to store data for different databases or different types of data. For example, hot data and cold data can be kept in separate tablespaces backed by disks of different performance, reducing cost and improving performance.

Directories for the tablespace data files must be created on the hosts; in Greenplum Database, the corresponding directory must exist on the Master, the Standby Master, and every Segment host.

Creating a Tablespace

Create the tablespace directories, create the tablespace, and set it as the default tablespace of a database.

[gpadmin@gpnode1 ~]$ gpssh -f hostfile_exkeys -e 'mkdir -p /data/tablespace/testtbl'
[gpnode3] mkdir -p /data/tablespace/testtbl
[gpnode2] mkdir -p /data/tablespace/testtbl
[gpnode1] mkdir -p /data/tablespace/testtbl
[gpadmin@gpnode1 ~]$ psql 
psql (9.4.26)
Type "help" for help.

postgres=# CREATE TABLESPACE testtbl LOCATION '/data/tablespace/testtbl';
CREATE TABLESPACE

postgres=# alter database tester set tablespace testtbl;
ALTER DATABASE
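
To confirm the tablespace exists and that the database now uses it by default, the system catalogs can be queried (a minimal check):

postgres=# SELECT spcname FROM pg_tablespace;
postgres=# SELECT datname, dattablespace FROM pg_database WHERE datname = 'tester';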

System Administration

Configuring the Greenplum Database System

About Greenplum Database Master and Local Parameters

The Greenplum Database configuration file postgresql.conf resides in the data directory of each database instance; the Master instance and every Segment instance have their own copy.

[gpadmin@gpnode1 ~]$ ll /data/master/gpseg-1/postgresql.conf
-rw------- 1 gpadmin gpadmin 23665 May 16 22:07 /data/master/gpseg-1/postgresql.conf

[gpadmin@gpnode1 ~]$ ll /data/primary/gpseg0/postgresql.conf 
-rw------- 1 gpadmin gpadmin 23525 May 16 22:08 /data/primary/gpseg0/postgresql.conf

[gpadmin@gpnode1 ~]$ ll /data/primary/gpseg1/postgresql.conf 
-rw------- 1 gpadmin gpadmin 23525 May 16 22:08 /data/primary/gpseg1/postgresql.conf

[gpadmin@gpnode1 ~]$ ll /data/mirror/gpseg3/postgresql.conf
-rw------- 1 gpadmin gpadmin 23563 May 29 18:10 /data/mirror/gpseg3/postgresql.conf

[gpadmin@gpnode1 ~]$ ll /data/mirror/gpseg4/postgresql.conf
-rw------- 1 gpadmin gpadmin 23563 May 29 18:10 /data/mirror/gpseg4/postgresql.conf

Setting Configuration Parameters

Most configuration parameters are session parameters, which can be set at the system, database, role, or session level. Database users can change most session parameters within their own sessions; the remainder require superuser privileges.

Setting a Local Configuration Parameter

Use gpconfig to modify the postgresql.conf files of the Master instance and all Segment instances.

[gpadmin@gpnode1 ~]$ gpconfig -s log_statement
Values on all segments are consistent
GUC          : log_statement
Master  value: all
Segment value: none
[gpadmin@gpnode1 ~]$ gpconfig -s log_truncate_on_rotation
Values on all segments are consistent
GUC          : log_truncate_on_rotation
Master  value: off
Segment value: off
[gpadmin@gpnode1 ~]$ gpconfig -s gp_enable_global_deadlock_detector
Values on all segments are consistent
GUC          : gp_enable_global_deadlock_detector
Master  value: off
Segment value: off
[gpadmin@gpnode1 ~]$ gpconfig -s wal_segment_size
Values on all segments are consistent
GUC          : wal_segment_size
Master  value: 64MB
Segment value: 64MB
[gpadmin@bdatnode1 ~]$ gpconfig -s checkpoint_segments
Values on all segments are consistent
GUC          : checkpoint_segments
Master  value: 8
Segment value: 8

[gpadmin@gpnode1 ~]$ gpconfig -c log_statement -v none
20230529:19:55:25:018429 gpconfig:gpnode1:gpadmin-[INFO]:-completed successfully with parameters '-c log_statement -v none'
[gpadmin@gpnode1 ~]$ gpconfig -c log_truncate_on_rotation -v on
20230529:18:10:09:005160 gpconfig:gpnode1:gpadmin-[INFO]:-completed successfully with parameters '-c log_truncate_on_rotation -v on'
[gpadmin@gpnode1 ~]$ gpconfig -c gp_enable_global_deadlock_detector -v on
20230529:19:55:03:018187 gpconfig:gpnode1:gpadmin-[INFO]:-completed successfully with parameters '-c gp_enable_global_deadlock_detector -v on'

[gpadmin@gpnode1 ~]$ gpconfig -s log_statement
Values on all segments are consistent
GUC          : log_statement
Master  value: all
Segment value: none
[gpadmin@gpnode1 ~]$ gpconfig -s log_truncate_on_rotation
Values on all segments are consistent
GUC          : log_truncate_on_rotation
Master  value: off
Segment value: off
[gpadmin@gpnode1 ~]$ gpconfig -s gp_enable_global_deadlock_detector
Values on all segments are consistent
GUC          : gp_enable_global_deadlock_detector
Master  value: off
Segment value: off

Note:

If changing a parameter fails with an error such as:

GUC Validation Failed: xxxxxx cannot be changed under normal conditions

append the --skipvalidation option to the command.
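
For example, a parameter that is rejected by normal validation could be forced like this (some_guc and its value are placeholders; use this option only when you are sure the change is safe):

$ gpconfig -c some_guc -v some_value --skipvalidation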

A reload is required for the changes to take effect:

[gpadmin@gpnode1 ~]$ gpstop -u
20230529:18:11:28:005656 gpstop:gpnode1:gpadmin-[INFO]:-Starting gpstop with args: -u
20230529:18:11:28:005656 gpstop:gpnode1:gpadmin-[INFO]:-Gathering information and validating the environment...
20230529:18:11:28:005656 gpstop:gpnode1:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20230529:18:11:28:005656 gpstop:gpnode1:gpadmin-[INFO]:-Obtaining Segment details from master...
20230529:18:11:28:005656 gpstop:gpnode1:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 6.24.3 build commit:25d3498a400ca5230e81abb94861f23389315213 Open Source'
20230529:18:11:28:005656 gpstop:gpnode1:gpadmin-[INFO]:-Signalling all postmaster processes to reload
20230529:18:11:29:005656 gpstop:gpnode1:gpadmin-[INFO]:---------------------------------------------
20230529:18:11:29:005656 gpstop:gpnode1:gpadmin-[INFO]:-Some segment postmasters were not reloaded
20230529:18:11:29:005656 gpstop:gpnode1:gpadmin-[INFO]:---------------------------------------------
20230529:18:11:29:005656 gpstop:gpnode1:gpadmin-[INFO]:-   Host        Datadir               Port   Status
20230529:18:11:29:005656 gpstop:gpnode1:gpadmin-[INFO]:-   gpnode1   /data/mirror/gpseg3   7001   d
20230529:18:11:29:005656 gpstop:gpnode1:gpadmin-[INFO]:-   gpnode1   /data/mirror/gpseg4   7000   d
20230529:18:11:29:005656 gpstop:gpnode1:gpadmin-[INFO]:---------------------------------------------

[gpadmin@gpnode1 ~]$ gpconfig -s log_statement
Values on all segments are consistent
GUC          : log_statement
Master  value: none
Segment value: none
[gpadmin@gpnode1 ~]$ gpconfig -s log_truncate_on_rotation
Values on all segments are consistent
GUC          : log_truncate_on_rotation
Master  value: on
Segment value: on
[gpadmin@gpnode1 ~]$ gpconfig -s gp_enable_global_deadlock_detector
Values on all segments are consistent
GUC          : gp_enable_global_deadlock_detector
Master  value: off
Segment value: off
[gpadmin@gpnode1 ~]$ gpconfig -s work_mem
Values on all segments are consistent
GUC          : work_mem
Master  value: 512MB
Segment value: 512MB

Note that gp_enable_global_deadlock_detector has still not taken effect; it requires a restart:

[gpadmin@gpnode1 ~]$ gpstop -M fast
[gpadmin@gpnode1 ~]$ gpstart
[gpadmin@gpnode1 ~]$ gpconfig -s gp_enable_global_deadlock_detector
Values on all segments are consistent
GUC          : gp_enable_global_deadlock_detector
Master  value: on
Segment value: on
Setting a Master Configuration Parameter

Master parameters are set on the Greenplum Database Master instance. For session parameters, the value can be set for a particular database, role, or session. When a parameter is set at multiple levels, the narrower scope wins: a session setting overrides a role setting, a role setting overrides a database setting, and a database setting overrides the system setting.

Setting Parameters at the System Level

Parameter settings in the Master's postgresql.conf file serve as the system-level defaults. The steps are:

  1. Edit the $MASTER_DATA_DIRECTORY/postgresql.conf file.
  2. Find the parameter to set, uncomment it (remove the leading # character), and enter the value.
  3. Save and close the file.
  4. For session parameters that do not require a server restart, load the changes with the following command.
$ gpstop -u
  5. For parameters that require a server restart, restart Greenplum Database.
$ gpstop -r

Example: change the log_filename parameter so that at most one day of log files is retained.

[gpadmin@gpnode1 ~]$ gpconfig -s log_filename
Values on all segments are consistent
GUC          : log_filename
Master  value: gpdb-%Y-%m-%d_%H%M%S.csv
Segment value: gpdb-%Y-%m-%d_%H%M%S.csv
[gpadmin@gpnode1 ~]$ vi $MASTER_DATA_DIRECTORY/postgresql.conf
log_filename = 'gpdb-%H.csv'
[gpadmin@gpnode1 ~]$ gpstop -u
[gpadmin@gpnode1 ~]$ gpconfig -s log_filename
Values on all segments are consistent
GUC          : log_filename
Master  value: gpdb-%H.csv
Segment value: gpdb-%Y-%m-%d_%H%M%S.csv

Setting Parameters at the Database Level

Use ALTER DATABASE to set a parameter at the database level.

=# ALTER DATABASE mydatabase SET search_path TO myschema;

When a session parameter is set at the database level, every session that connects to that database uses that setting. A database-level setting overrides the system-level setting.
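
To see which database-level (and role-level) settings are currently stored, the pg_db_role_setting catalog can be queried, for example (a minimal sketch):

=# SELECT d.datname, s.setconfig FROM pg_db_role_setting s LEFT JOIN pg_database d ON d.oid = s.setdatabase;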

Setting Parameters at the Role Level

Use ALTER ROLE to set a parameter at the role level.

=# ALTER ROLE bob SET search_path TO bobschema;

When a session parameter is set at the role level, every session started by that role uses that setting. A role-level setting overrides a database-level setting.

Setting Parameters in a Session

Use the SET command to set a session parameter in an active database session.

=# SET statement_mem TO '200MB';

Use the RESET command to reset a session parameter to its default.

=# RESET statement_mem;

A session-level setting overrides a role-level setting.

Viewing Server Configuration Parameter Settings

Use the following SQL command to view parameter settings on the Master instance:

[gpadmin@gpnode1 ~]$ psql -c 'SHOW ALL;'
[gpadmin@gpnode1 ~]$ psql -c 'SHOW ALL;' | grep wal_segment_size
 wal_segment_size   | 64MB   | Shows the number of pages per write ahead log segment.
 [gpadmin@bdatnode1 ~]$ psql -c 'SHOW ALL;' | grep gp_vmem_protect_limit
 gp_vmem_protect_limit | 8192  | Virtual memory limit (in MB) of Greenplum memory protection.

Use gpconfig to view parameter settings across the whole system, including the Master and Segment instances:

[gpadmin@gpnode1 ~]$ gpconfig --show wal_segment_size
Values on all segments are consistent
GUC          : wal_segment_size
Master  value: 64MB
Segment value: 64MB
[gpadmin@gpnode1 ~]$ gpconfig --show max_connections
Values on all segments are consistent
GUC          : max_connections
Master  value: 250
Segment value: 750
[gpadmin@bdatnode1 ~]$ gpconfig -s gp_vmem_protect_limit
Values on all segments are consistent
GUC          : gp_vmem_protect_limit
Master  value: 8192
Segment value: 8192

TroubleShooting

Clean pg_xlog

WAL archiving is turned off by default in Greenplum:

[gpadmin@gpnode1 ~]$ gpconfig -s archive_mode
Values on all segments are consistent
GUC          : archive_mode
Master  value: off
Segment value: off

However, if a very large transaction prevents the WAL logs from being removed automatically and they fill up the disk, they must be cleaned up manually as follows:
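
Before resetting anything, it is worth confirming that pg_xlog is actually what is filling the disk, for example (assuming the data directory layout used in this guide):

$ du -sh /data/primary/gpseg*/pg_xlog /data/mirror/gpseg*/pg_xlog
$ df -h /data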

[gpadmin@gpnode1 ~]$ gpstate
[gpadmin@gpnode1 ~]$ gpstop -M fast

[gpadmin@gpnode2 ~]$ pg_controldata /data/primary/gpseg3 | egrep 'NextXID|NextOID'
Latest checkpoint's NextXID:          0/2828
Latest checkpoint's NextOID:          16619

# -o is the NextOID obtained above, -x is the NextXID obtained above, and -f is the corresponding data directory
[gpadmin@gpnode2 ~]$ pg_resetxlog -o 16619 -x 2828 -f /data/primary/gpseg3

[gpadmin@gpnode3 ~]$ pg_controldata /data/primary/gpseg4 | egrep 'NextXID|NextOID'
Latest checkpoint's NextXID:          0/7306
Latest checkpoint's NextOID:          16619

[gpadmin@gpnode3 ~]$ pg_resetxlog -o 16619 -x 7306 -f /data/primary/gpseg4

# Alternatively, the following one-liner does the same; adjust the directory variable as needed.
[gpadmin@gpnode3 ~]$ export SEGMENT_DIR=/data/primary/gpseg0; pg_resetxlog -o `pg_controldata $SEGMENT_DIR | egrep NextOID | awk '{print $4}'` -x `pg_controldata $SEGMENT_DIR | egrep NextXID | awk '{print $4}' | awk -F / '{print $2}'` -f $SEGMENT_DIR

[gpadmin@gpnode1 ~]$ gpstart

To avoid this situation in the first place, the application or ETL tool still needs to limit transaction sizes.
