Author: Zhang Qian, "Alien No. 2", currently also the senior litter-box attendant for six cats.
Source: original contribution. *Produced by the ActionSky open source community (爱可生开源社区). Original content may not be used without authorization; to repost it, contact the editor and credit the source.
If you have accidentally deleted a data file and have no replacement node, don't pack up and run just yet: consider using the parameter server_permanent_offline_time to rebuild the affected node.
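How it works
server_permanent_offline_time is the OceanBase parameter that controls how long a node may stay down before it is treated as permanently offline. When a node in the cluster goes down, the system acts according to this value.
If the node has been down for less than the configured value, the system does nothing for the time being, which avoids frequent data migration. Once the downtime exceeds the value, the node is marked permanently offline: RootService removes the replicas hosted on that OBServer from their Paxos member groups and replenishes them on other available OBServers in the same zone, so that every Paxos member group stays complete. The default is 3600 seconds; the value is usually set fairly large to avoid unnecessary replica copying. In addition, when a permanently offline node is brought back up, all of its data has to be pulled again from the other replicas.
In this scenario, the idea is to lower the parameter so that the failed node quickly goes permanently offline, then bring it back online so that its data is rebuilt.
Note that this process consumes cluster resources and may affect performance, so it is best carried out during off-peak hours.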
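The rule described above can be written down as a small decision function. This is only a sketch of the behaviour as this article describes it, not OceanBase code: the names `OfflineDecision` and `classify_offline` are hypothetical, and the text does not say what happens when the downtime equals the threshold exactly, so the sketch arbitrarily treats that case as permanently offline.

```python
from dataclasses import dataclass


@dataclass
class OfflineDecision:
    permanent: bool
    action: str


def classify_offline(downtime_s: float, permanent_offline_time_s: float = 3600.0) -> OfflineDecision:
    """Apply the threshold rule: act only once the downtime reaches the threshold."""
    if downtime_s < permanent_offline_time_s:
        # Temporarily offline: nothing is migrated, replicas stay in their Paxos groups.
        return OfflineDecision(False, "wait")
    # Permanently offline (assumption: the boundary case downtime == threshold lands here).
    return OfflineDecision(True, "remove replicas from Paxos groups and replenish in the same zone")


# With the default of 3600s a 5-minute outage is ignored; with 60s it is not.
print(classify_offline(300).permanent)
print(classify_offline(300, 60).permanent)
```

Lowering the threshold from 3600s to 60s is what lets the same short outage trigger the rebuild path.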
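Official recommendations
The official guidance on applicable scenarios and recommended values for server_permanent_offline_time is as follows:
OceanBase version upgrade: set the parameter to 72h.
OBServer hardware replacement: set the parameter to 4h.
Bringing a wiped OBServer back online: set the parameter to 10m so that the cluster comes back online quickly.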
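To compare these values with each other, or with the 60s used later in this article, it helps to convert them to seconds. The helper below is a minimal sketch: the scenario keys and the function names are hypothetical, only the `s`, `m` and `h` suffixes that appear in this article are handled, and the 20-second lower bound is taken from the `Range: [20s,∞)` note in the SHOW PARAMETERS output shown later.

```python
import re

# Recommended values quoted above; the dictionary keys are made-up labels.
RECOMMENDED = {
    "version_upgrade": "72h",
    "hardware_replacement": "4h",
    "wipe_and_rejoin": "10m",
}

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
MIN_SECONDS = 20  # lower bound reported by SHOW PARAMETERS in this article


def to_seconds(value: str) -> int:
    """Convert a value such as '60s', '10m' or '72h' to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_SECONDS[unit]


def is_valid(value: str) -> bool:
    """True if the value parses and is not below the documented minimum."""
    return to_seconds(value) >= MIN_SECONDS


for scenario, value in RECOMMENDED.items():
    print(scenario, value, to_seconds(value))
```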
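Preparation
Set up an environment
Use the OBD tool to quickly deploy a 3-node OB cluster plus one OBProxy, then create a tenant named sysbench_tenant with primary_zone set to RANDOM.
Note: this article is based on OB 3.1.2; verify separately on other versions.
Prepare some data
Use sysbench to create a table sbtest1 and insert 10,000 rows.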
sysbench ./oltp_insert.lua --mysql-host=10.186.60.3 --mysql-port=2883 --mysql-db=sysbenchdb --mysql-user=sysbench@sysbench_tenant --mysql-password=sysbench --tables=1 --table_size=10000 --threads=1 --time=600 --report-interval=10 --db-driver=mysql --db-ps-mode=disable --skip-trx=on --mysql-ignore-errors=6002,6004,4012,2013,4016,1062,5157,4038 prepare
The sysbench CREATE TABLE statement was modified here to split the table into 3 partitions. The partition replica distribution of sbtest1 is as follows:
MySQL [oceanbase]> select tenant.tenant_name, zone, svr_ip,svr_port, case when role=1 then 'leader' when role=2 then 'follower' else NULL end as role, count(1) as partition_cnt from __all_virtual_meta_table meta inner join __all_tenant tenant on meta.tenant_id=tenant.tenant_id inner join __all_virtual_table tab on meta.tenant_id=tab.tenant_id and meta.table_id=tab.table_id where tenant.tenant_id=1001 and tab.table_name='sbtest1' group by tenant.tenant_name,zone, svr_ip,svr_port, 5 order by tenant.tenant_name, zone, svr_ip, role desc;
-------------------------------------------------------------------------
| tenant_name | zone | svr_ip | svr_port | role | partition_cnt |
-------------------------------------------------------------------------
| sysbench_tenant | zone1 | 10.186.64.74 | 2882 | leader | 1 |
| sysbench_tenant | zone1 | 10.186.64.74 | 2882 | follower | 2 |
| sysbench_tenant | zone2 | 10.186.64.75 | 2882 | leader | 1 |
| sysbench_tenant | zone2 | 10.186.64.75 | 2882 | follower | 2 |
| sysbench_tenant | zone3 | 10.186.64.79 | 2882 | leader | 1 |
| sysbench_tenant | zone3 | 10.186.64.79 | 2882 | follower | 2 |
-------------------------------------------------------------------------
Start the experiment
Use sysbench to keep writing data and sustain some traffic, so that the data on each node can be compared for consistency after the rebuild.
sysbench ./oltp_insert.lua --mysql-host=10.186.60.3 --mysql-port=2883 --mysql-db=sysbenchdb --mysql-user=sysbench@sysbench_tenant --mysql-password=sysbench --tables=1 --table_size=10000 --threads=1 --time=300 --report-interval=10 --db-driver=mysql --db-ps-mode=disable --skip-trx=on --mysql-ignore-errors=6002,6004,4012,2013,4016,1062,5157,4038 run
Delete a node's data file
Pick node 10.186.64.79 in zone3 and delete its data file.
[root@localhost data]# rm -rf 1/sstable/block_file
[root@localhost data]# cd 1/sstable/
[root@localhost sstable]# ll
total 0
Take the failed node permanently offline
1. Lower server_permanent_offline_time to shorten the time before the node goes permanently offline
The default value of server_permanent_offline_time is 3600s.
MySQL [oceanbase]> alter system set server_permanent_offline_time='60s';
Query OK, 0 rows affected (0.030 sec)

MySQL [oceanbase]> SHOW PARAMETERS LIKE '%server_permanent_offline_time%';
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| zone | svr_type | svr_ip | svr_port | name | data_type | value | info | section | scope | source | edit_level |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| zone3 | observer | 10.186.64.79 | 2882 | server_permanent_offline_time | NULL | 60s | the time interval between any two heartbeats beyond which a server is considered to be \permanently\ offline. Range: [20s,∞) | ROOT_SERVICE | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
| zone1 | observer | 10.186.64.74 | 2882 | server_permanent_offline_time | NULL | 60s | the time interval between any two heartbeats beyond which a server is considered to be \permanently\ offline. Range: [20s,∞) | ROOT_SERVICE | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
| zone2 | observer | 10.186.64.75 | 2882 | server_permanent_offline_time | NULL | 60s | the time interval between any two heartbeats beyond which a server is considered to be \permanently\ offline. Range: [20s,∞) | ROOT_SERVICE | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
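--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2. Stop the failed node from serving requests
Before killing the ob process, it is recommended to isolate (ISOLATE SERVER) or stop (STOP SERVER) the node, which stops requests from being sent to it and moves the leader role of its replicas elsewhere. Re-enable traffic once the node has been rebuilt.
# Stop service on node 79
MySQL [oceanbase]> ALTER SYSTEM STOP SERVER '10.186.64.79:2882' ZONE='zone3';
# Or isolate it
ALTER SYSTEM ISOLATE SERVER '10.186.64.79:2882' ZONE='zone3';
3. Kill the observer process
Run kill -9 $observer_pid and wait for server_permanent_offline_time to elapse; the ob node then enters the "permanently offline" state. To determine whether it has gone permanently offline, query the table __all_rootservice_event_history: if it contains an event named permanent_offline whose time and IP both match, the node can be considered permanently offline.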
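The "time and IP both match" check can be expressed as a small predicate over the rows returned by the query that follows. This is a hedged sketch rather than an official tool: `is_permanently_offline` is a hypothetical name, the rows are assumed to have been fetched already and reduced to the `gmt_create`, `event` and `value1` columns, and the kill time in the usage lines is invented for illustration.

```python
from datetime import datetime


def is_permanently_offline(events, server, killed_at):
    """True if a permanent_offline event for `server` was logged at or after `killed_at`.

    `events` is an iterable of (gmt_create, event, value1) tuples taken from
    __all_rootservice_event_history; value1 holds the 'ip:port' of the server.
    """
    for gmt_create, event, value1 in events:
        if event == "permanent_offline" and value1 == server and gmt_create >= killed_at:
            return True
    return False


# The event row is the one from this article's output; the kill time is hypothetical.
events = [(datetime(2023, 3, 29, 17, 34, 9, 596035), "permanent_offline", "10.186.64.79:2882")]
killed_at = datetime(2023, 3, 29, 17, 33, 0)
print(is_permanently_offline(events, "10.186.64.79:2882", killed_at))  # True
```

Comparing against the kill time avoids mistaking an old permanent_offline event from an earlier incident for the current one.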
MySQL [oceanbase]> select * from __all_rootservice_event_history where event='permanent_offline';
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| gmt_create | module | event | name1 | value1 | name2 | value2 | name3 | value3 | name4 | value4 | name5 | value5 | name6 | value6 | extra_info | rs_svr_ip | rs_svr_port |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 2023-03-29 17:34:09.596035 | server | permanent_offline | server | 10.186.64.79:2882 | | | | | | | | | | | | 10.186.64.74 | 2882 |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
The partition replica distribution below no longer contains any replicas on node 79, which further confirms that node 79 is permanently offline.
One follower on node 75 in zone2 has been promoted to leader, so the cluster can still serve requests.
MySQL [oceanbase]> select tenant.tenant_name, zone, svr_ip,svr_port, case when role=1 then 'leader' when role=2 then 'follower' else NULL end as role, count(1) as partition_cnt from __all_virtual_meta_table meta inner join __all_tenant tenant on meta.tenant_id=tenant.tenant_id inner join __all_virtual_table tab on meta.tenant_id=tab.tenant_id and meta.table_id=tab.table_id where tenant.tenant_id=1001 and tab.table_name='sbtest1' group by tenant.tenant_name,zone, svr_ip,svr_port, 5 order by tenant.tenant_name, zone, svr_ip, role desc;
-------------------------------------------------------------------------
| tenant_name | zone | svr_ip | svr_port | role | partition_cnt |
-------------------------------------------------------------------------
| sysbench_tenant | zone1 | 10.186.64.74 | 2882 | leader | 1 |
| sysbench_tenant | zone1 | 10.186.64.74 | 2882 | follower | 2 |
| sysbench_tenant | zone2 | 10.186.64.75 | 2882 | leader | 2 |
| sysbench_tenant | zone2 | 10.186.64.75 | 2882 | follower | 1 |
-------------------------------------------------------------------------
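4 rows in set (0.005 sec)
Bring the failed node back up to trigger the automatic data rebuild
1. Start the ob process on node 79; the rebuild is triggered automatically once the process is up.
Note: to keep the ob process from failing to start or running into other problems, it is recommended to clear both the data files and the transaction logs before starting it.
[root@localhost data]# rm -rf log1/clog/*
[root@localhost data]# rm -rf log1/ilog/*
[root@localhost data]# rm -rf log1/slog/*
[root@localhost data]# rm -rf 1/sstable/block_file
[root@localhost data]# cd 1/sstable/
[root@localhost sstable]# ll
total 0
[root@localhost sstable]# su admin
bash-4.2$ cd /home/admin/ && ./bin/observer
./bin/observer
After the process starts, confirm that the ob heartbeat has recovered and the status is active, then watch the partitions being replenished:
MySQL [oceanbase]> select svr_ip,zone,with_rootserver,status,stop_time,start_service_time,build_version from __all_server;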
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
| svr_ip | zone | with_rootserver | status | stop_time | start_service_time | build_version |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 10.186.64.74 | zone1 | 1 | active | 0 | 1679984798650860 | 3.1.2_10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d(Dec 30 2021 02:47:29) |
| 10.186.64.75 | zone2 | 0 | active | 0 | 1679984801289281 | 3.1.2_10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d(Dec 30 2021 02:47:29) |
| 10.186.64.79 | zone3 | 0 | active | 1680082329964975 | 1680082511964975 | 3.1.2_10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d(Dec 30 2021 02:47:29) |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
3 rows in set (0.002 sec)

MySQL [oceanbase]> select count(*),zone from gv$partition group by zone;
-----------------
| count(*) | zone |
-----------------
| 1322 | zone1 |
| 1322 | zone2 |
| 152 | zone3 |
-----------------
3 rows in set (0.228 sec)

MySQL [oceanbase]> select count(*),zone from gv$partition group by zone;
-----------------
| count(*) | zone |
-----------------
| 1322 | zone1 |
| 1322 | zone2 |
| 664 | zone3 |
-----------------
3 rows in set (0.113 sec)

MySQL [oceanbase]> select count(*),zone from gv$partition group by zone;
-----------------
| count(*) | zone |
-----------------
| 1322 | zone1 |
| 1322 | zone2 |
| 1179 | zone3 |
-----------------
3 rows in set (0.112 sec)

MySQL [oceanbase]> select count(*),zone from gv$partition group by zone;
-----------------
| count(*) | zone |
-----------------
| 1322 | zone1 |
| 1322 | zone2 |
| 1322 | zone3 |
-----------------
3 rows in set (0.116 sec)
Once the partition counts in all three zones match and zone3 shows replica information again, the rebuild can be considered complete.
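The "counts match across zones" check is easy to automate once the per-zone counts from gv$partition are available. The sketch below assumes they have already been fetched into a dict; `rebuild_complete` and `progress` are hypothetical helper names, and the sample numbers are the ones from the outputs above.

```python
def rebuild_complete(zone_counts):
    """True when every zone reports the same, non-zero number of partition replicas."""
    counts = set(zone_counts.values())
    return len(counts) == 1 and 0 not in counts


def progress(zone_counts, rebuilding_zone):
    """Replicas present in `rebuilding_zone` as a fraction of the fullest zone."""
    target = max(zone_counts.values(), default=0)
    if target == 0:
        return 0.0
    # A zone with no replicas yet may be missing from the query result entirely.
    return zone_counts.get(rebuilding_zone, 0) / target


samples = [
    {"zone1": 1322, "zone2": 1322, "zone3": 152},
    {"zone1": 1322, "zone2": 1322, "zone3": 664},
    {"zone1": 1322, "zone2": 1322, "zone3": 1179},
    {"zone1": 1322, "zone2": 1322, "zone3": 1322},
]
for counts in samples:
    print(f"{progress(counts, 'zone3'):.0%}", rebuild_complete(counts))
```

Equal counts are only a coarse signal; the replica distribution query for the table of interest remains the more direct confirmation.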
Node 79 is still isolated, so it holds no leader replicas yet.
MySQL [oceanbase]> select tenant.tenant_name, zone, svr_ip,svr_port, case when role=1 then 'leader' when role=2 then 'follower' else NULL end as role, count(1) as partition_cnt from __all_virtual_meta_table meta inner join __all_tenant tenant on meta.tenant_id=tenant.tenant_id inner join __all_virtual_table tab on meta.tenant_id=tab.tenant_id and meta.table_id=tab.table_id where tenant.tenant_id=1001 and tab.table_name='sbtest1' group by tenant.tenant_name,zone, svr_ip,svr_port, 5 order by tenant.tenant_name, zone, svr_ip, role desc;
-------------------------------------------------------------------------
| tenant_name | zone | svr_ip | svr_port | role | partition_cnt |
-------------------------------------------------------------------------
| sysbench_tenant | zone1 | 10.186.64.74 | 2882 | leader | 1 |
| sysbench_tenant | zone1 | 10.186.64.74 | 2882 | follower | 2 |
| sysbench_tenant | zone2 | 10.186.64.75 | 2882 | leader | 2 |
| sysbench_tenant | zone2 | 10.186.64.75 | 2882 | follower | 1 |
| sysbench_tenant | zone3 | 10.186.64.79 | 2882 | follower | 3 |
-------------------------------------------------------------------------
6 rows in set (0.005 sec)
2. Re-enable service on the failed node
Run the following command to take node 79 out of isolation.
ALTER SYSTEM START SERVER '10.186.64.79:2882' ZONE='zone3';
The partition replica distribution is now as follows; the leader role has moved back to node 79.
MySQL [oceanbase]> select tenant.tenant_name, zone, svr_ip,svr_port, case when role=1 then 'leader' when role=2 then 'follower' else NULL end as role, count(1) as partition_cnt from __all_virtual_meta_table meta inner join __all_tenant tenant on meta.tenant_id=tenant.tenant_id inner join __all_virtual_table tab on meta.tenant_id=tab.tenant_id and meta.table_id=tab.table_id where tenant.tenant_id=1001 and tab.table_name='sbtest1' group by tenant.tenant_name,zone, svr_ip,svr_port, 5 order by tenant.tenant_name, zone, svr_ip, role desc;
-------------------------------------------------------------------------
| tenant_name | zone | svr_ip | svr_port | role | partition_cnt |
-------------------------------------------------------------------------
| sysbench_tenant | zone1 | 10.186.64.74 | 2882 | leader | 1 |
| sysbench_tenant | zone1 | 10.186.64.74 | 2882 | follower | 2 |
| sysbench_tenant | zone2 | 10.186.64.75 | 2882 | leader | 1 |
| sysbench_tenant | zone2 | 10.186.64.75 | 2882 | follower | 2 |
| sysbench_tenant | zone3 | 10.186.64.79 | 2882 | leader | 1 |
| sysbench_tenant | zone3 | 10.186.64.79 | 2882 | follower | 2 |
-------------------------------------------------------------------------
3. Set the server_permanent_offline_time threshold back to its default of 3600s
MySQL [oceanbase]> alter system set server_permanent_offline_time='3600s';
Query OK, 0 rows affected (0.028 sec)

MySQL [oceanbase]> SHOW PARAMETERS LIKE '%server_permanent_offline_time%';
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| zone | svr_type | svr_ip | svr_port | name | data_type | value | info | section | scope | source | edit_level |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| zone2 | observer | 10.186.64.75 | 2882 | server_permanent_offline_time | NULL | 3600s | the time interval between any two heartbeats beyond which a server is considered to be \permanently\ offline. Range: [20s,∞) | ROOT_SERVICE | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
| zone1 | observer | 10.186.64.74 | 2882 | server_permanent_offline_time | NULL | 3600s | the time interval between any two heartbeats beyond which a server is considered to be \permanently\ offline. Range: [20s,∞) | ROOT_SERVICE | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
| zone3 | observer | 10.186.64.79 | 2882 | server_permanent_offline_time | NULL | 3600s | the time interval between any two heartbeats beyond which a server is considered to be \permanently\ offline. Range: [20s,∞) | ROOT_SERVICE | CLUSTER | DEFAULT | DYNAMIC_EFFECTIVE |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
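3 rows in set (0.007 sec)
Verify the data volume on each ob node
sysbench has finished running. Connecting directly to each observer shows that the row counts are identical.
[root@localhost ~]# obclient -h10.186.64.74 -P2881 -usysbench@sysbench_tenant -Dsysbenchdb -A -psysbench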
Welcome to the OceanBase. Commands end with ; or \g.
Your MySQL connection id is 3221545401
Server version: 5.7.25 OceanBase 3.1.2 (r10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d) (Built Dec 30 2021 02:47:29)

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [sysbenchdb]> select count(*) from sbtest1;
----------
| count(*) |
----------
| 53195 |
----------
1 row in set (0.036 sec)

MySQL [sysbenchdb]> exit
Bye
[root@localhost ~]# obclient -h10.186.64.75 -P2881 -usysbench@sysbench_tenant -Dsysbenchdb -A -psysbench
Welcome to the OceanBase. Commands end with ; or \g.
Your MySQL connection id is 3221823448
Server version: 5.7.25 OceanBase 3.1.2 (r10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d) (Built Dec 30 2021 02:47:29)

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [sysbenchdb]> select count(*) from sbtest1;
----------
| count(*) |
----------
| 53195 |
----------
1 row in set (0.040 sec)

MySQL [sysbenchdb]> exit
Bye
[root@localhost ~]# obclient -h10.186.64.79 -P2881 -usysbench@sysbench_tenant -Dsysbenchdb -A -psysbench
Welcome to the OceanBase. Commands end with ; or \g.
Your MySQL connection id is 3222011907
Server version: 5.7.25 OceanBase 3.1.2 (r10000392021123010-d4ace121deae5b81d8f0b40afbc4c02705b7fc1d) (Built Dec 30 2021 02:47:29)

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [sysbenchdb]> select count(*) from sbtest1;
----------
| count(*) |
----------
| 53195 |
----------
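1 row in set (0.037 sec)

MySQL [sysbenchdb]>
Summary
When data files are corrupted or lost, the affected node can be rebuilt by adjusting the parameter server_permanent_offline_time.
1. Lower the server_permanent_offline_time threshold.
2. Stop the failed node from serving requests.
3. Kill the node's process.
4. Once the threshold is exceeded, the node is marked permanently offline; the system automatically clears its replicas and migrates data to other nodes in the same zone.
5. Start the OB process, which automatically triggers the rebuild of the node's data.
6. Re-enable service on the failed node.
7. Set server_permanent_offline_time back to its original value.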
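As a checklist, the SQL side of these steps can be generated for a given node. This is a minimal sketch under assumptions: `rebuild_statements` is a hypothetical helper, the statement quoting follows the form used in this article on OB 3.1.2 and should be checked against your version, and steps 3 and 5 (killing and restarting the observer process) happen on the host shell, so they appear only as comments.

```python
def rebuild_statements(ip, port, zone, temporary="60s", original="3600s"):
    """Return the SQL statements of the procedure, in order, as (step, sql) pairs."""
    server = f"{ip}:{port}"
    return [
        ("1 lower threshold", f"ALTER SYSTEM SET server_permanent_offline_time='{temporary}';"),
        ("2 stop traffic", f"ALTER SYSTEM STOP SERVER '{server}' ZONE='{zone}';"),
        # Step 3 runs on the host: kill -9 the observer process, then wait for the threshold.
        ("4 confirm offline", "SELECT * FROM __all_rootservice_event_history WHERE event='permanent_offline';"),
        # Step 5 runs on the host: clear data and log files, then start ./bin/observer.
        ("5 watch rebuild", "SELECT count(*), zone FROM gv$partition GROUP BY zone;"),
        ("6 resume traffic", f"ALTER SYSTEM START SERVER '{server}' ZONE='{zone}';"),
        ("7 restore threshold", f"ALTER SYSTEM SET server_permanent_offline_time='{original}';"),
    ]


for step, sql in rebuild_statements("10.186.64.79", 2882, "zone3"):
    print(f"{step}: {sql}")
```

Keeping the original value as an explicit argument makes it harder to forget step 7 on clusters where the parameter was not at its default.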