# mogdb-monitor

**Repository Path**: enmotech/mogdb-monitor

## Basic Information

- **Project Name**: mogdb-monitor
- **Description**: MogDB Monitor
- **Primary Language**: Unknown
- **License**: MulanPSL-2.0
- **Default Branch**: master
- **Homepage**: https://www.mogdb.io
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 3
- **Created**: 2022-01-07
- **Last Updated**: 2024-11-25

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# MogDB Cluster Monitoring Deployment

This document is for users who deploy the monitoring and alerting system manually.

| Deployed on | Port | Purpose |
|:-------|:----|:----- |
| Monitoring server | 9001 | prometheus |
| Monitoring server | 9002 | grafana |
| Monitoring server | 9003 | AlertManager |
| Monitoring server | 9004 | SNMP |
| Database server | 9100 | node_exporter |
| Database server | 9187 | og_exporter |

Download this project.

```bash
git clone https://gitee.com/enmotech/mogdb-monitor.git
cd mogdb-monitor
```

The monitoring services can be deployed in one of two ways.

- [Docker-Compose](#docker-compose)
- [Manual installation](#manual-installation)
- [exporter](#exporter)
  - [node_exporter](#node_exporter)
  - [og_exporter](#og_exporter)

## Docker-Compose

- Start the services

```bash
docker-compose up -d
```

- Check the status (for example with `docker-compose ps`)

```
    Name                  Command               State                    Ports
----------------------------------------------------------------------------------------------------------
alertmanager   /bin/alertmanager --config ...   Up      0.0.0.0:9003->9093/tcp,:::9003->9093/tcp
grafana        /run.sh                          Up      0.0.0.0:9002->3000/tcp,:::9002->3000/tcp
prometheus     /bin/prometheus --config.f ...   Up      0.0.0.0:9001->9090/tcp,:::9001->9090/tcp
```

- Deploy node_exporter: see [node_exporter](#node_exporter)
- Deploy og_exporter: see [og_exporter](#og_exporter)

> Targets are discovered automatically through [file_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config), so `prometheus.yml` does not need to be edited every time.

- Add a scrape target to prometheus

To add a target, create a file under the `prometheus/target/[os|db]` directory.
For example:

```bash
# Put host monitoring targets in the os directory
# Put database monitoring targets in the db directory
vi prometheus/target/os/os_192.168.1.100.yml
- targets:
  - '192.168.1.100:9100'
  labels:
    env: product
    job: node_exporter
    instance: 192.168.1.100
```

## Manual installation

> A manual installation also requires cloning the project.

### prometheus

#### Download prometheus

Download location: choose the package for your platform; this guide uses the arm platform.

Installation package: prometheus-2.31.1.linux-arm64.tar.gz

```bash
# arm platform
wget https://github.com/prometheus/prometheus/releases/download/v2.31.1/prometheus-2.31.1.linux-arm64.tar.gz
# amd platform
wget https://github.com/prometheus/prometheus/releases/download/v2.32.1/prometheus-2.32.1.linux-amd64.tar.gz
```

#### Create the prometheus user

1. As root, create the prometheus user

```bash
useradd prometheus
```

2. As root, create the prometheus directories and extract the package

```bash
mkdir -p /app/prometheus/prometheus
mkdir -p /app/prometheus/data
mkdir -p /app/prometheus/target
tar -zxvf prometheus-2.31.1.linux-arm64.tar.gz --strip-components 1 -C /app/prometheus/prometheus/
# Copy the prometheus configuration files from the project
cp -r prometheus/* /app/prometheus/
chown -R prometheus: /app/prometheus
```

3. As the prometheus user, edit the configuration file.

```bash
su - prometheus
sed -i "s#/etc/prometheus#/app/prometheus/#g" /app/prometheus/prometheus.yml
sed -i "s#alertmanager:9093#127.0.0.1:9003#g" /app/prometheus/prometheus.yml
# Point the prometheus self-scrape target at the port prometheus listens on (9001, see the port table)
sed -i "s#127.0.0.1:9090#127.0.0.1:9001#g" /app/prometheus/prometheus.yml
```

> Note: the contents of `scrape_configs` must be adjusted. Change the `targets` of the `prometheus` job to the ip:port of the machine it is deployed on, and do the same for the `node_exporter` targets.

4. As root, configure prometheus to start at boot by creating the prometheus.service file as follows.

```bash
vi /usr/lib/systemd/system/prometheus.service
```

```bash
[Unit]
Description=Prometheus Service
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/app/prometheus/prometheus/prometheus --web.listen-address=:9001 --config.file=/app/prometheus/prometheus.yml --storage.tsdb.path=/app/prometheus/data
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
```

> `web.listen-address` is the monitoring server's own IP and the port used by prometheus

5. 
   Start the prometheus service

```bash
systemctl daemon-reload
systemctl enable prometheus
systemctl start prometheus
systemctl status prometheus
```

6. Verify the prometheus service

Verify the prometheus service with a web browser: open the monitoring server address on port 9001. If the page looks like the screenshot below, the prometheus service is working normally.

![prometheus](media/12e2733d21005fe6440355562d1f718b.png)

### Grafana

#### Download the package

Download location: choose the package for your platform; this guide uses the arm platform.

Installation package: grafana-enterprise-7.5.11.linux-arm64.tar.gz

```bash
# arm64
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-7.5.11.linux-arm64.tar.gz
# amd64
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-7.5.11.linux-amd64.tar.gz
```

#### Installation and deployment

1. As root, extract and install the package

```bash
mkdir -p /app/grafana/grafana
mkdir -p /app/grafana/data
tar -zxvf grafana-enterprise-7.5.11.linux-arm64.tar.gz --strip-components 1 -C /app/grafana/grafana
# Copy the grafana configuration files from the project
cp -r grafana/* /app/grafana/
chown -R prometheus: /app/grafana/
```

2. As the prometheus user, check the installed version

```bash
/app/grafana/grafana/bin/grafana-server -v
```

The output looks like this:

```bash
Version 7.5.11 (commit: 6f8c1d9fe4, branch: HEAD)
```

3. As the prometheus user, configure grafana.

```bash
# vi /app/grafana/conf/defaults.ini
# Change the grafana port to 9002
sed -i "s#http_port = 3000#http_port = 9002#g" /app/grafana/conf/grafana.ini
# Update the provisioning settings for the auto-registered data source and dashboards
sed -i "s#provisioning = conf/provisioning#provisioning = /app/grafana/provisioning#g" /app/grafana/conf/grafana.ini
sed -i "s#http://prometheus:9090#http://127.0.0.1:9001#g" /app/grafana/provisioning/datasources/prometheus.yaml
sed -i "s#/usr/share/grafana/dashboards#/app/grafana/dashboard#g" /app/grafana/provisioning/dashboards/mogdb.yaml
```

4. As root, configure the service to start at boot by creating the grafana.service file as follows:

```bash
vi /usr/lib/systemd/system/grafana.service
```

```bash
[Unit]
Description=Grafana Service

[Service]
User=prometheus
ExecStart=/app/grafana/grafana/bin/grafana-server -homepath /app/grafana/grafana/ -config /app/grafana/conf/grafana.ini

[Install]
WantedBy=multi-user.target
```

5. 
   As root, start the grafana service and confirm its status

```bash
systemctl daemon-reload
systemctl enable grafana.service
systemctl start grafana.service
systemctl status grafana.service
```

6. Verify the grafana service

Verify the grafana service with a web browser: open the monitoring server address on port 9002. If the page looks like the screenshot below, the grafana service is working normally.

![grafana.png](media/c24933ecfde17e32ca4cc905ab3b8371.png)

Account: admin
Default password: Mogdb@123

The dashboard templates and the data source are imported automatically.

### AlertManager

#### Download the package

Download location: choose the package for your platform; this guide uses the arm platform.

Installation package: alertmanager-0.23.0.linux-arm64.tar.gz

```bash
wget https://github.com/prometheus/alertmanager/releases/download/v0.23.0/alertmanager-0.23.0.linux-arm64.tar.gz
```

#### Installation and deployment

1. As root, extract the package to /app/alertmanager and set ownership

```bash
mkdir -p /app/alertmanager/
mkdir -p /app/alertmanager/data/
tar -zxvf alertmanager-0.23.0.linux-arm64.tar.gz --strip-components 1 -C /app/alertmanager/
# Copy the alertmanager configuration files from the project
cp alertmanager/* /app/alertmanager/
chown -R prometheus: /app/alertmanager/
```

2. As the prometheus user, edit the alertmanager.yml file. In the example below, the `url` under `webhook_configs` is the monitoring server address and the port used by SNMP.

```bash
vi /app/alertmanager/alertmanager.yml
```

```yaml
route:
  group_by: ['...']
  group_wait: 10s
  group_interval: 30s
  repeat_interval: 1m
  receiver: 'snmp_notifier'
  routes:
    # database alert group by instance, server
    - receiver: 'snmp_notifier'
      group_by: [instance, server]
      group_wait: 10s
      matchers:
        - service=~"MogDB"
receivers:
  - name: 'snmp_notifier'
    webhook_configs:
      - url: http://127.0.0.1:9004/alerts
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
```

> This example adds snmp_notifier as the alert notification channel. Adjust it to your own situation. For the configuration file reference, see [alertmanager](https://prometheus.io/docs/alerting/latest/configuration/)

3. 
   As root, configure the service to start at boot by creating the alertmanager.service file as follows. `web.listen-address` is the monitoring server address and the port used by AlertManager.

```bash
vi /usr/lib/systemd/system/alertmanager.service
```

```bash
[Unit]
Description=Prometheus Alert Manager
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/app/alertmanager/alertmanager --web.listen-address=:9003 --storage.path=/app/alertmanager/data --config.file=/app/alertmanager/alertmanager.yml
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
```

4. Start the alertmanager service and confirm its status

```bash
systemctl daemon-reload
systemctl enable alertmanager
systemctl start alertmanager
systemctl status alertmanager
```

5. Configure alertmanager in prometheus

The default configuration file already contains these settings, so this step can be skipped.

In the prometheus configuration file prometheus.yml, change the following content so that the address and port are the monitoring server address and the alertmanager port, and enable the alert rule files rules/*.yml (the rule files, mogdb_rules.yml and node_rules.yml, need to be uploaded to the actual storage path: /appdata/prometheus/prometheus-2.31.1.linux-arm64/etc/rules/). See the alert rules and files for details.

```bash
vi /app/prometheus/prometheus.yml
```

```yaml
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "127.0.0.1:9003"
```

Also add the following content to prometheus.yml; the target is the monitoring server address and the port used by alertmanager.

```yaml
  - job_name: 'alertmanager'
    static_configs:
      - targets: ['127.0.0.1:9003']
```

As shown in the figure below:

![img](media/f24969877db79dd0acf5288baa392d2c.png)

6. As root, restart the prometheus service and confirm its status

```bash
systemctl restart prometheus
systemctl status prometheus
```

After the services are deployed, check on the targets page whether they were added successfully.

![screenshot](media/e263109af2d1f8f0fc5037c9e4bb42f6.png)

### SNMP deployment (optional)

#### Download the package

Download location: choose the package for your platform; this guide uses the arm platform.

Installation package: snmp_notifier-1.2.1.linux-arm64.tar.gz

```bash
wget https://github.com/maxwo/snmp_notifier/releases/download/v1.2.1/snmp_notifier-1.2.1.linux-arm64.tar.gz
```

#### Installation and deployment

1. Log in to the monitoring server as root, extract the package to `/app/snmp_notifier` and set ownership

```bash
mkdir -p /app/snmp_notifier/
tar -zxvf snmp_notifier-1.2.1.linux-arm64.tar.gz --strip-components 1 -C /app/snmp_notifier/
```

2. 
   As root, configure the service to start at boot by creating the snmp_notifier.service file. `web.listen-address` is the monitoring server address and the port used by SNMP.

```bash
vi /usr/lib/systemd/system/snmp_notifier.service
```

```bash
[Unit]
Description=Prometheus SNMP Notifier Service
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/app/snmp_notifier/snmp_notifier --web.listen-address=":9004" --snmp.trap-description-template /app/snmp_notifier/description-template.tpl --snmp.destination=50.1.2.67:162
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
```

> - `--snmp.destination`: the address that receives the traps
> - `--snmp.extra-field-template=4=/app/snmp_notifier/4_extra-field-template.tpl`: adds an extra field to the SNMP trap, for example
>
> `echo "{{ len .Alerts }} alerts are firing." > /app/snmp_notifier/4_extra-field-template.tpl`
>
> `--snmp.extra-field-template=4=/app/snmp_notifier/4_extra-field-template.tpl`
>
> ```bash
> Agent Address: 0.0.0.0
> Agent Hostname: localhost
> Date: 1 - 0 - 0 - 1 - 1 - 1970
> Enterprise OID: .
> Trap Type: Cold Start
> Trap Sub-Type: 0
> Community/Infosec Context: TRAP2, SNMP v2c, community public
> Uptime: 0
> Description: Cold Start
> PDU Attribute/Value Pair Array:
> .iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance = Timeticks: (2665700) 7:24:17.00
> .iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTrap.snmpTrapOID.0 = OID: .iso.org.dod.internet.private.enterprises.98789.0.1
> .iso.org.dod.internet.private.enterprises.98789.0.1.1 = STRING: "1.3.6.1.4.1.98789.0.1[environment=production,label=test]"
> .iso.org.dod.internet.private.enterprises.98789.0.1.2 = STRING: "critical"
> .iso.org.dod.internet.private.enterprises.98789.0.1.3 = STRING: "Status: critical
> - Alert: TestAlert
>   Summary: This is the summary
>   Description: This is the description on job1
> Status: warning
> - Alert: TestAlert
>   Summary: This is the random summary
>   Description: This is the description of alert 1"
> .iso.org.dod.internet.private.enterprises.98789.0.1.4 = STRING: "2 alerts are firing."
> ```
>
> `.iso.org.dod.internet.private.enterprises.98789.0.1.4 = STRING: "2 alerts are firing."` is the newly added field

3. Start the SNMP service and confirm its status

```bash
systemctl daemon-reload
systemctl enable snmp_notifier
systemctl start snmp_notifier
systemctl status snmp_notifier
```

4. Configure SNMP Notifier in prometheus

Add the following content to the prometheus configuration file `/app/prometheus/prometheus.yml`

```yaml
  - job_name: 'snmp_notifier'
    static_configs:
      - targets: ['110.128.131.16:9004']
```

5. Restart the prometheus service

As root, restart the prometheus service

```bash
systemctl restart prometheus
```

## exporter

### node_exporter

#### Download the package

Download location: choose the package for your platform; this guide uses the arm platform.

Installation package: node_exporter-1.2.2.linux-arm64.tar.gz

```bash
wget https://github.com/prometheus/node_exporter/releases/download/v1.2.2/node_exporter-1.2.2.linux-arm64.tar.gz
```

#### Installation and deployment

> Note: node_exporter must be installed on the primary and standby database servers as well as on the monitoring server itself.

1. As root, extract the package to the /app/node_exporter directory

```bash
mkdir -p /app/node_exporter
tar -zxvf node_exporter-1.2.2.linux-arm64.tar.gz --strip-components 1 -C /app/node_exporter
```

2. As root, configure node_exporter to start at boot by creating the node_exporter.service file as follows.

```bash
vi /usr/lib/systemd/system/node_exporter.service
```

```bash
[Unit]
Description=Prometheus Node Exporter Service
After=network.target

[Service]
Type=simple
User=root
ExecStart=/app/node_exporter/node_exporter --web.listen-address=:9100 --no-collector.softnet
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
```

> `web.listen-address` is the monitoring server's own IP and the port used by node_exporter

3. Start the node_exporter service

```bash
systemctl daemon-reload
systemctl enable node_exporter
systemctl start node_exporter
```

4. Check the node_exporter service

```bash
systemctl status node_exporter
curl http://localhost:9100/metrics
```

5. 
   Add the scrape target to prometheus

```
vi /app/prometheus/target/os/os_.yml
```

```yml
- targets:
  - '127.0.0.1:9100'
  labels:
    env: product
    job: prometheus
    instance: 127.0.0.1
```

### og_exporter

#### Download the package

Download location: choose the package for your platform; this guide uses the arm platform.

Installation package: opengauss_exporter_1.0.0_linux_arm64.zip

```bash
wget https://gitee.com/opengauss/openGauss-prometheus-exporter/attach_files/973905/download/opengauss_exporter_1.0.0_linux_arm64.zip
```

> og_exporter can be deployed on the database server or on any other server, as long as og_exporter can connect to the database.

#### Installation and deployment

1. As root, extract the package to the /app/opengauss_exporter directory and set ownership

```bash
mkdir -p /app/opengauss_exporter
unzip opengauss_exporter_1.0.0_linux_arm64.zip -d /app/opengauss_exporter
```

2. Copy the `og_expoter/queries.yaml` file from the project into `/app/opengauss_exporter/`

```bash
cp og_expoter/queries.yaml /app/opengauss_exporter/
```

#### Monitoring over a local connection

Monitoring over a local connection requires the exporter to run as the operating system user that owns the database.

1. Check the database socket path

```
postgres=# show unix_socket_directory;
 unix_socket_directory
-----------------------
 /tmp
(1 row)
```

2. Configure the service to start at boot

```bash
vi /usr/lib/systemd/system/mogdb_exporter.service
```

```bash
[Unit]
Description=Prometheus MogDB Exporter Service

[Service]
# The service user must be the same as the database operating system user
User=omm
Environment="DATA_SOURCE_NAME=host=/tmp port=26000 user=omm dbname=postgres"
ExecStart=/app/opengauss_exporter/opengauss_exporter --auto-discover-databases --exclude-databases="template0,template1" --web.listen-address=:9187 --config=/app/opengauss_exporter/queries.yaml

[Install]
WantedBy=multi-user.target
```

3. Start the dbexporter service and check its status

```bash
systemctl daemon-reload
systemctl enable mogdb_exporter.service
systemctl start mogdb_exporter.service
systemctl status mogdb_exporter.service
```

#### Monitoring over a remote connection

1. Create a monitoring user in each database to be monitored. The password must meet the database's complexity requirements; by default it needs upper and lower case letters plus a special character and must be at least 8 characters long.

Create the user in the database to be monitored:

```bash
gsql -Uomm postgres -r -p 26000
-- Inside gsql: check whether the value is 1; if it is not, change it to 1
show password_encryption_type;
alter system set password_encryption_type=1;
```

```bash
CREATE USER db_exporter WITH PASSWORD 'Admin@1234' MONADMIN;
grant usage on schema dbe_perf to db_exporter;
grant select on pg_stat_replication to db_exporter;
```

2. 
   Configure pg_hba.conf to whitelist the monitoring server with md5 authentication. Run the following on the primary database as the omm user; if the exporter runs on the same machine, change the IP to 0.0.0.0/0.

```bash
gs_guc set -I all -N all -h "host postgres db_exporter 110.128.131.16/32 md5"
```

3. On the monitoring server, as root, configure the db_exporter service to start at boot by creating the mogdb_exporter.service file as follows. `Environment` holds the database IP address and the database listening port 26000; `web.listen-address` is the port used by dbexporter, 9187.

```bash
vi /usr/lib/systemd/system/mogdb_exporter.service
```

```bash
[Unit]
Description=Prometheus MogDB Exporter Service

[Service]
# The prometheus operating system user must exist; create it yourself if it does not
User=prometheus
Environment="DATA_SOURCE_NAME=postgresql://db_exporter:Admin@1234@35.10.3.34:26000/postgres?sslmode=disable"
ExecStart=/app/opengauss_exporter/opengauss_exporter --auto-discover-databases --exclude-databases="template0,template1" --web.listen-address=:9187 --config=/app/opengauss_exporter/queries.yaml

[Install]
WantedBy=multi-user.target
```

> Note: for a cluster, write two Environment entries

4. Start the dbexporter service and check its status

```bash
systemctl daemon-reload
systemctl enable mogdb_exporter.service
systemctl start mogdb_exporter.service
systemctl status mogdb_exporter.service
```

#### Add database nodes to Prometheus

```
vi /app/prometheus/target/db/db_.yml
```

```yml
- targets:
  - ':9187'
  labels:
    job: mogdb_exporter
    instance:
```
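
Because targets are discovered through file_sd_config, adding a database node comes down to writing one small YAML file. The helper below is a minimal sketch of that step, not part of the project: the function name `write_db_target` is hypothetical, the `db_<ip>.yml` file name is an assumption modeled on the `os_192.168.1.100.yml` example in the Docker-Compose section, and using the exporter IP as the `instance` label is likewise an assumption.

```shell
# Hypothetical helper (not shipped with mogdb-monitor): write a file_sd target
# file for one og_exporter endpoint listening on port 9187.
# Usage: write_db_target <exporter-ip> [target-dir]
write_db_target() {
    ip="$1"
    # Default directory follows the manual installation layout in this document
    dir="${2:-/app/prometheus/target/db}"
    # Assumed naming scheme: db_<ip>.yml
    file="${dir}/db_${ip}.yml"

    mkdir -p "$dir" || return 1

    # Same structure as the target file shown above, with the IP filled in
    cat > "$file" <<EOF
- targets:
  - '${ip}:9187'
  labels:
    job: mogdb_exporter
    instance: ${ip}
EOF

    # Print the path of the file that was written
    printf '%s\n' "$file"
}
```

For example, `write_db_target 192.0.2.10` would create `/app/prometheus/target/db/db_192.0.2.10.yml` (192.0.2.10 is a documentation address; substitute the real exporter host). The new target should then show up on the prometheus targets page without any change to `prometheus.yml`.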