# mogdb-monitor
**Repository Path**: enmotech/mogdb-monitor
## Basic Information
- **Project Name**: mogdb-monitor
- **Description**: MogDB Monitor
- **Primary Language**: Unknown
- **License**: MulanPSL-2.0
- **Default Branch**: master
- **Homepage**: https://www.mogdb.io
- **GVP Project**: No
## Statistics
- **Stars**: 1
- **Forks**: 3
- **Created**: 2022-01-07
- **Last Updated**: 2024-11-25
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# MogDB Cluster Monitoring Deployment
This document is for users who deploy the monitoring and alerting system manually.
|Deployed on|Port|Purpose|
|:-------|:----|:-----|
|Monitoring server|9001|prometheus|
|Monitoring server|9002|grafana|
|Monitoring server|9003|AlertManager|
|Monitoring server|9004|SNMP|
|Database server|9100|node_exporter|
|Database server|9187|og_exporter|
Download the project:
```bash
git clone https://gitee.com/enmotech/mogdb-monitor.git
cd mogdb-monitor
```
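There are two ways to deploy the monitoring services. The exporters are deployed separately in either case.
- [Docker-Compose](#docker-compose)
- [Manual installation](#manual-installation)
- [exporter](#exporter)
  - [node_exporter](#node_exporter)
  - [og_exporter](#og_exporter)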
## Docker-Compose
- Start the services
```bash
docker-compose up -d
```
- Check the status (for example with `docker-compose ps`)
```
Name Command State Ports
----------------------------------------------------------------------------------------------------------
alertmanager /bin/alertmanager --config ... Up 0.0.0.0:9003->9093/tcp,:::9003->9093/tcp
grafana /run.sh Up 0.0.0.0:9002->3000/tcp,:::9002->3000/tcp
prometheus /bin/prometheus --config.f ... Up 0.0.0.0:9001->9090/tcp,:::9001->9090/tcp
```
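- To deploy node_exporter, see [node_exporter](#node_exporter)
- To deploy og_exporter, see [og_exporter](#og_exporter)
> The configuration uses [file_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config) for automatic target discovery, so `prometheus.yml` does not need to be edited every time a target is added.
- Add scrape targets to Prometheus
Create a file under `prometheus/target/os` or `prometheus/target/db`. For example: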
```bash
# Put host (OS) targets in the os directory
# and database targets in the db directory
vi prometheus/target/os/os_192.168.1.100.yml
# File contents:
- targets:
    - '192.168.1.100:9100'
  labels:
    env: product
    job: node_exporter
    instance: 192.168.1.100
```
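When many hosts need to be added, the target files can be generated rather than written by hand. The following is a minimal sketch, not part of the project: the function name `write_sd_target` and its argument order are made up for illustration, and the file layout simply mirrors the example above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with mogdb-monitor): write one file_sd target file.
# usage: write_sd_target <target_dir> <prefix> <ip> <port> <job> [env]
write_sd_target() {
  local dir=$1 prefix=$2 ip=$3 port=$4 job=$5 env=${6:-product}
  mkdir -p "$dir"
  cat > "$dir/${prefix}_${ip}.yml" <<EOF
- targets:
    - '${ip}:${port}'
  labels:
    env: ${env}
    job: ${job}
    instance: ${ip}
EOF
}

# Demo against a scratch directory; use prometheus/target/os for a real deployment.
demo_dir=$(mktemp -d)
write_sd_target "$demo_dir/os" os 192.168.1.100 9100 node_exporter
cat "$demo_dir/os/os_192.168.1.100.yml"
```

Because Prometheus re-reads file_sd files on change, a file written this way should be picked up without a restart.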
## Manual installation
> The manual installation also requires cloning the project.
### prometheus
#### Download prometheus
Download the package for your platform. This guide uses the arm64 build.
Installation package: prometheus-2.31.1.linux-arm64.tar.gz
```bash
# arm64
wget https://github.com/prometheus/prometheus/releases/download/v2.31.1/prometheus-2.31.1.linux-arm64.tar.gz
# amd64 (note: this URL points at v2.32.1, so adjust the file name in the later steps)
wget https://github.com/prometheus/prometheus/releases/download/v2.32.1/prometheus-2.32.1.linux-amd64.tar.gz
```
#### Create the prometheus user
1. As root, create the prometheus user
```bash
useradd prometheus
```
2. As root, create the Prometheus directories and unpack the package
```bash
mkdir -p /app/prometheus/prometheus
mkdir -p /app/prometheus/data
mkdir -p /app/prometheus/target
tar -zxvf prometheus-2.31.1.linux-arm64.tar.gz --strip-components 1 -C /app/prometheus/prometheus/
# Copy the Prometheus configuration files from the project
cp -r prometheus/* /app/prometheus/
chown -R prometheus: /app/prometheus
```
3. As the prometheus user, edit the configuration file.
```bash
su - prometheus
sed -i "s#/etc/prometheus#/app/prometheus/#g" /app/prometheus/prometheus.yml
sed -i "s#alertmanager:9093#127.0.0.1:9003#g" /app/prometheus/prometheus.yml
sed -i "s#127.0.0.1:9090#127.0.0.1:9001#g" /app/prometheus/prometheus.yml
```
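> Note: review the contents of `scrape_configs`. Set the `targets` of the `prometheus` job to the ip:port of the machine it is deployed on, and do the same for the `node_exporter` targets.
4. As root, configure Prometheus to start at boot by creating the prometheus.service file as follows: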
```bash
vi /usr/lib/systemd/system/prometheus.service
```
```bash
[Unit]
Description=Prometheus Service
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/app/prometheus/prometheus/prometheus --web.listen-address=:9001 --config.file=/app/prometheus/prometheus.yml --storage.tsdb.path=/app/prometheus/data
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
```
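> `web.listen-address` is the monitoring server's own IP and the port used by Prometheus. Omitting the IP, as in `:9001`, listens on all interfaces.
5. Start the Prometheus service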
```bash
systemctl daemon-reload
systemctl enable prometheus
systemctl start prometheus
systemctl status prometheus
```
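6. Verify the Prometheus service
Open the Prometheus web UI (port 9001 on the monitoring server) in a browser. If the page loads, the Prometheus service is running normally.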
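The same check can be done from the command line. Prometheus serves a `/-/healthy` endpoint for this purpose. The sketch below assumes it is run on the monitoring server itself (hence `127.0.0.1`); the function name `check_http` is made up for illustration.

```bash
#!/usr/bin/env bash
# check_http NAME URL: print OK if URL answers without an HTTP error
# (curl -f treats status codes of 400 and above as failures), FAIL otherwise.
check_http() {
  local name=$1 url=$2
  if curl -fsS -o /dev/null --max-time 3 "$url" 2>/dev/null; then
    echo "OK   $name"
  else
    echo "FAIL $name ($url)"
    return 1
  fi
}

# Assumption: Prometheus listens on port 9001 of this host, as configured above.
check_http prometheus http://127.0.0.1:9001/-/healthy || true
```

The same function can be pointed at other services once they are deployed, for example Grafana's `/api/health` or Alertmanager's `/-/healthy`.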

### Grafana
#### Download the package
Download the package for your platform. This guide uses the arm64 build.
Installation package: grafana-enterprise-7.5.11.linux-arm64.tar.gz
```bash
# arm64
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-7.5.11.linux-arm64.tar.gz
# amd64
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-7.5.11.linux-amd64.tar.gz
```
#### Installation
1. As root, unpack and install
```bash
mkdir -p /app/grafana/grafana
mkdir -p /app/grafana/data
tar -zxvf grafana-enterprise-7.5.11.linux-arm64.tar.gz --strip-components 1 -C /app/grafana/grafana
# Copy the Grafana configuration files from the project
cp -r grafana/* /app/grafana/
chown -R prometheus: /app/grafana/
```
2. As the prometheus user, check the installed version
```bash
/app/grafana/grafana/bin/grafana-server -v
```
Expected output:
```bash
Version 7.5.11 (commit: 6f8c1d9fe4, branch: HEAD)
```
3. As the prometheus user, configure Grafana
```bash
# To edit by hand instead: vi /app/grafana/conf/grafana.ini
# Change the Grafana port to 9002
sed -i "s#http_port = 3000#http_port = 9002#g" /app/grafana/conf/grafana.ini
# Point provisioning at the bundled data source and dashboard definitions
sed -i "s#provisioning = conf/provisioning#provisioning = /app/grafana/provisioning#g" /app/grafana/conf/grafana.ini
sed -i "s#http://prometheus:9090#http://127.0.0.1:9001#g" /app/grafana/provisioning/datasources/prometheus.yaml
sed -i "s#/usr/share/grafana/dashboards#/app/grafana/dashboard#g" /app/grafana/provisioning/dashboards/mogdb.yaml
```
4. As root, configure the service to start at boot by creating the grafana.service file as follows:
```bash
vi /usr/lib/systemd/system/grafana.service
```
```bash
[Unit]
Description=Grafana Service
[Service]
User=prometheus
ExecStart=/app/grafana/grafana/bin/grafana-server -homepath /app/grafana/grafana/ -config /app/grafana/conf/grafana.ini
[Install]
WantedBy=multi-user.target
```
5. As root, start the Grafana service and confirm its status
```bash
systemctl daemon-reload
systemctl enable grafana.service
systemctl start grafana.service
systemctl status grafana.service
```
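6. Verify the Grafana service
Open the Grafana web UI (port 9002 on the monitoring server) in a browser. If the login page loads, the Grafana service is running normally.

- Account: admin
- Default password: Mogdb@123

The dashboard templates and the data source are imported automatically.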
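One caveat with the `sed -i` edits used in step 3: if the pattern is not present (for example because the line is commented out or spaced differently), `sed` exits successfully without changing anything. A minimal sketch of a guarded replacement follows. The helper name `replace_checked` is hypothetical, and the demo runs against a scratch file that stands in for `grafana.ini`.

```bash
#!/usr/bin/env bash
# replace_checked FILE OLD NEW: replace OLD with NEW in FILE, but fail loudly
# when OLD does not occur. OLD is checked as a fixed string and then handed to
# sed as a pattern, so it should not contain '#' or regex metacharacters
# whose regex meaning differs from their literal meaning.
replace_checked() {
  local file=$1 old=$2 new=$3
  if ! grep -qF -- "$old" "$file"; then
    echo "pattern not found in $file: $old" >&2
    return 1
  fi
  sed -i "s#${old}#${new}#g" "$file"
}

# Demo on a scratch file (assumed content, not the real grafana.ini).
ini=$(mktemp)
printf 'http_port = 3000\n' > "$ini"
replace_checked "$ini" "http_port = 3000" "http_port = 9002"
cat "$ini"
```

Running the same replacement a second time reports the missing pattern instead of silently succeeding, which makes a skipped edit visible.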
### AlertManager
#### Download the package
Download the package for your platform. This guide uses the arm64 build.
Installation package: alertmanager-0.23.0.linux-arm64.tar.gz
```bash
wget https://github.com/prometheus/alertmanager/releases/download/v0.23.0/alertmanager-0.23.0.linux-arm64.tar.gz
```
#### Installation
1. As root, unpack the package into /app/alertmanager and set ownership
```bash
mkdir -p /app/alertmanager/
mkdir -p /app/alertmanager/data/
tar -zxvf alertmanager-0.23.0.linux-arm64.tar.gz --strip-components 1 -C /app/alertmanager/
# Copy the Alertmanager configuration files from the project
cp alertmanager/* /app/alertmanager/
chown -R prometheus: /app/alertmanager/
```
2. As the prometheus user, edit alertmanager.yml. The `url` under `webhook_configs` is the monitoring server address and the port used by the SNMP notifier:
```bash
vi /app/alertmanager/alertmanager.yml
```
```yaml
route:
  group_by: ['...']
  group_wait: 10s
  group_interval: 30s
  repeat_interval: 1m
  receiver: 'snmp_notifier'
  routes:
    # database alert group by instance, server
    - receiver: 'snmp_notifier'
      group_by: [instance, server]
      group_wait: 10s
      matchers:
        - service=~"MogDB"
receivers:
  - name: 'snmp_notifier'
    webhook_configs:
      - url: http://127.0.0.1:9004/alerts
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
```
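> This adds the snmp_notifier alert notification. Adjust it to your own environment. For the configuration file format, see the [Alertmanager configuration reference](https://prometheus.io/docs/alerting/latest/configuration/).
3. As root, configure the service to start at boot by creating the alertmanager.service file as follows. `web.listen-address` is the monitoring server address and the port used by Alertmanager: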
```bash
vi /usr/lib/systemd/system/alertmanager.service
```
```bash
[Unit]
Description=Prometheus Alert Manager
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/app/alertmanager/alertmanager --web.listen-address=:9003 --storage.path=/app/alertmanager/data --config.file=/app/alertmanager/alertmanager.yml
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
```
4. Start the Alertmanager service and confirm its status
```bash
systemctl daemon-reload
systemctl enable alertmanager
systemctl start alertmanager
systemctl status alertmanager
```
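5. Configure Alertmanager in Prometheus
The default configuration file already contains these settings, so this step can usually be skipped.
In prometheus.yml, set the address and port below to the monitoring server address and the Alertmanager port, and enable the alert rule files rules/*.yml. The rule files (mogdb_rules.yml and node_rules.yml) must be uploaded to the rules directory actually used by Prometheus (for example /appdata/prometheus/prometheus-2.31.1.linux-arm64/etc/rules/). See the alert rules and files for details.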
```bash
vi /app/prometheus/prometheus.yml
```
```yaml
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "127.0.0.1:9003"
```
Also add the following job to prometheus.yml. The target is the monitoring server address and the port used by Alertmanager:
```yaml
  - job_name: 'alertmanager'
    static_configs:
      - targets: ['127.0.0.1:9003']
```

6. As root, restart the Prometheus service and confirm its status
```bash
systemctl restart prometheus
systemctl status prometheus
```
After the services are deployed, check the Targets page in Prometheus to confirm that they are being scraped successfully.
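Target health is also available from the Prometheus HTTP API at `/api/v1/targets`, where each active target carries a `health` field (`up`, `down` or `unknown`). The sketch below counts targets by health with plain `grep`. It assumes the compact JSON that Prometheus emits (no spaces around the colon), and the canned response is a shortened, made-up sample so the example runs without a live server.

```bash
#!/usr/bin/env bash
# count_health STATE: read a /api/v1/targets JSON response on stdin and count
# the targets whose "health" field equals STATE (up, down or unknown).
count_health() {
  grep -o "\"health\":\"$1\"" | wc -l | tr -d ' '
}

# Shortened sample response (assumed shape, most fields omitted).
sample='{"status":"success","data":{"activeTargets":[{"health":"up"},{"health":"down"},{"health":"up"}]}}'
echo "up:   $(echo "$sample" | count_health up)"
echo "down: $(echo "$sample" | count_health down)"

# Against a live server (address assumed to be the monitoring host):
# curl -s http://127.0.0.1:9001/api/v1/targets | count_health down
```

A non-zero `down` count after a deployment usually points at a wrong address, port or firewall rule for one of the exporters.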

### SNMP deployment (optional)
#### Download the package
Download the package for your platform. This guide uses the arm64 build.
Installation package: snmp_notifier-1.2.1.linux-arm64.tar.gz
```bash
wget https://github.com/maxwo/snmp_notifier/releases/download/v1.2.1/snmp_notifier-1.2.1.linux-arm64.tar.gz
```
#### Installation
1. Log in to the monitoring server as root, unpack the package into `/app/snmp_notifier` and set ownership
```bash
mkdir -p /app/snmp_notifier/
tar -zxvf snmp_notifier-1.2.1.linux-arm64.tar.gz --strip-components 1 -C /app/snmp_notifier/
chown -R prometheus: /app/snmp_notifier/
```
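2. As root, configure the service to start at boot by creating the snmp_notifier.service file. `web.listen-address` is the monitoring server address and the port used by the SNMP notifier: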
```bash
vi /usr/lib/systemd/system/snmp_notifier.service
```
```bash
[Unit]
Description=Prometheus SNMP Notifier Service
After=network.target
[Service]
Type=simple
User=prometheus
ExecStart=/app/snmp_notifier/snmp_notifier --web.listen-address=":9004" --snmp.trap-description-template /app/snmp_notifier/description-template.tpl --snmp.destination=50.1.2.67:162
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
```
> - `--snmp.destination`: address of the SNMP trap receiver
> - `--snmp.extra-field-template=4=/app/snmp_notifier/4_extra-field-template.tpl`: adds an extra field to the SNMP trap. For example, create the template with
> `echo "{{ len .Alerts }} alerts are firing." > /app/snmp_notifier/4_extra-field-template.tpl`
>
> and pass `--snmp.extra-field-template=4=/app/snmp_notifier/4_extra-field-template.tpl`. A resulting trap looks like this:
>
> ```bash
> Agent Address: 0.0.0.0
> Agent Hostname: localhost
> Date: 1 - 0 - 0 - 1 - 1 - 1970
> Enterprise OID: .
> Trap Type: Cold Start
> Trap Sub-Type: 0
> Community/Infosec Context: TRAP2, SNMP v2c, community public
> Uptime: 0
> Description: Cold Start
> PDU Attribute/Value Pair Array:
> .iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance = Timeticks: (2665700) 7:24:17.00
> .iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTrap.snmpTrapOID.0 = OID: .iso.org.dod.internet.private.enterprises.98789.0.1
> .iso.org.dod.internet.private.enterprises.98789.0.1.1 = STRING: "1.3.6.1.4.1.98789.0.1[environment=production,label=test]"
> .iso.org.dod.internet.private.enterprises.98789.0.1.2 = STRING: "critical"
> .iso.org.dod.internet.private.enterprises.98789.0.1.3 = STRING: "Status: critical
> - Alert: TestAlert
> Summary: This is the summary
> Description: This is the description on job1
> Status: warning
> - Alert: TestAlert
> Summary: This is the random summary
> Description: This is the description of alert 1"
> .iso.org.dod.internet.private.enterprises.98789.0.1.4 = STRING: "2 alerts are firing."
> ```
>
> `.iso.org.dod.internet.private.enterprises.98789.0.1.4 = STRING: "2 alerts are firing."` is the added field.
3. Start the SNMP notifier service and confirm its status
```bash
systemctl daemon-reload
systemctl enable snmp_notifier
systemctl start snmp_notifier
systemctl status snmp_notifier
```
4. Configure the SNMP Notifier in Prometheus
Add the following to the Prometheus configuration file `/app/prometheus/prometheus.yml`:
```yaml
  - job_name: 'snmp_notifier'
    static_configs:
      - targets: ['110.128.131.16:9004']
```
5. Restart the Prometheus service
As root, restart the Prometheus service:
```bash
systemctl restart prometheus
```
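To exercise the whole path (Alertmanager route, webhook, SNMP trap) without waiting for a real alert, a test alert can be posted to Alertmanager's `/api/v2/alerts` endpoint. The sketch below only builds and prints the payload, and the `curl` call is left as a comment. The alert name, annotations and the helper name `build_test_alert` are invented for testing; the `service` label is set to `MogDB` so that the alert matches the route configured earlier.

```bash
#!/usr/bin/env bash
# build_test_alert [INSTANCE]: print a one-element JSON array suitable for
# Alertmanager's POST /api/v2/alerts. All label values are test data.
build_test_alert() {
  local instance=${1:-127.0.0.1}
  cat <<EOF
[{"labels":{"alertname":"SnmpPathTest","service":"MogDB","severity":"warning","instance":"${instance}"},
  "annotations":{"summary":"manual test alert","description":"sent by hand to exercise the SNMP path"}}]
EOF
}

payload=$(build_test_alert 192.168.1.100)
echo "$payload"

# To send it (assumes Alertmanager on port 9003 of this host, as configured above):
# curl -s -XPOST -H 'Content-Type: application/json' -d "$payload" http://127.0.0.1:9003/api/v2/alerts
```

If the trap does not arrive at the receiver, `systemctl status snmp_notifier` and the Alertmanager web UI are reasonable places to look first.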
## exporter
### node_exporter
#### Download the package
Download the package for your platform. This guide uses the arm64 build.
Installation package: node_exporter-1.2.2.linux-arm64.tar.gz
```bash
wget https://github.com/prometheus/node_exporter/releases/download/v1.2.2/node_exporter-1.2.2.linux-arm64.tar.gz
```
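#### Installation
> Note: node_exporter must be installed on the primary and standby database servers and on the monitoring server itself.
1. As root, unpack the package into the /app/node_exporter directory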
```bash
mkdir -p /app/node_exporter
tar -zxvf node_exporter-1.2.2.linux-arm64.tar.gz --strip-components 1 -C /app/node_exporter
```
2. As root, configure node_exporter to start at boot by creating the node_exporter.service file as follows:
```bash
vi /usr/lib/systemd/system/node_exporter.service
```
```bash
[Unit]
Description=Prometheus Node Exporter Service
After=network.target
[Service]
Type=simple
User=root
ExecStart=/app/node_exporter/node_exporter --web.listen-address=:9100 --no-collector.softnet
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
```
> `web.listen-address` is the local server's IP and the port used by node_exporter.
3. Start the node_exporter service
```bash
systemctl daemon-reload
systemctl enable node_exporter
systemctl start node_exporter
```
4. Check the node_exporter service
```bash
systemctl status node_exporter
curl http://localhost:9100/metrics
```
5. Add the scrape target to Prometheus
```
vi /app/prometheus/target/os/os_.yml
```
```yml
- targets:
    - '127.0.0.1:9100'
  labels:
    env: product
    job: prometheus
    instance: 127.0.0.1
```
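The `/metrics` output checked in step 4 is plain text, one sample per line, so a single value can be pulled out with `awk`. The following is a small sketch: the helper name `metric_value` is made up, and the sample text is a hand-written stand-in for real node_exporter output (`node_load1` is one of its standard metrics).

```bash
#!/usr/bin/env bash
# metric_value NAME: read Prometheus text-format metrics on stdin and print the
# value of the first sample whose metric name is exactly NAME (no labels).
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Hand-written sample in the exposition format (values are invented).
sample='# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_load5 0.30'
echo "$sample" | metric_value node_load1

# Live usage on a host running node_exporter:
# curl -s http://localhost:9100/metrics | metric_value node_load1
```

Metrics that carry labels, such as per-CPU or per-device series, have the labels inside the first field and would need a looser match than this exact comparison.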
### og_exporter
#### Download the package
Download the package for your platform. This guide uses the arm64 build.
Installation package: opengauss_exporter_1.0.0_linux_arm64.zip
```bash
wget https://gitee.com/opengauss/openGauss-prometheus-exporter/attach_files/973905/download/opengauss_exporter_1.0.0_linux_arm64.zip
```
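> og_exporter can be deployed on the database server or on any other server, as long as it can connect to the database.
#### Installation
1. As root, unpack the package into the /app/opengauss_exporter directory and make sure the service user can read it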
```bash
mkdir -p /app/opengauss_exporter
unzip opengauss_exporter_1.0.0_linux_arm64.zip -d /app/opengauss_exporter
```
2. Copy the `og_expoter/queries.yaml` file from the project into `/app/opengauss_exporter/`
```bash
cp og_expoter/queries.yaml /app/opengauss_exporter/
```
#### Local connection monitoring
For local connections, the exporter must run as the database's operating system user.
1. Find the database socket directory
```
postgres=# show unix_socket_directory;
unix_socket_directory
-----------------------
/tmp
(1 row)
```
2. Configure the service to start at boot
```bash
vi /usr/lib/systemd/system/mogdb_exporter.service
```
```bash
[Unit]
Description=Prometheus MogDB Exporter Service
[Service]
# The service user must be the same as the database's operating system user
User=omm
Environment="DATA_SOURCE_NAME=host=/tmp port=26000 user=omm dbname=postgres"
ExecStart=/app/opengauss_exporter/opengauss_exporter --auto-discover-databases --exclude-databases="template0,template1" --web.listen-address=:9187 --config=/app/opengauss_exporter/queries.yaml
[Install]
WantedBy=multi-user.target
```
3. Start the exporter service and check its status
```bash
systemctl daemon-reload
systemctl enable mogdb_exporter.service
systemctl start mogdb_exporter.service
systemctl status mogdb_exporter.service
```
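#### Remote connection monitoring
1. Create a monitoring user in each monitored database. The password must meet the database's complexity requirements: by default, upper-case and lower-case letters plus special characters, and at least 8 characters.
Create the user in the database to be monitored: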
```bash
gsql -Uomm postgres -r -p 26000
show password_encryption_type; -- should be 1; if not, change it to 1
alter system set password_encryption_type=1;
```
```bash
CREATE USER db_exporter WITH PASSWORD 'Admin@1234' MONADMIN;
grant usage on schema dbe_perf to db_exporter;
grant select on pg_stat_replication to db_exporter;
```
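2. Add the monitoring server to the pg_hba.conf whitelist with md5 authentication. Run the following as the omm user on the primary. If the exporter runs on the database host itself, change the IP to 0.0.0.0/0.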
```bash
gs_guc set -I all -N all -h "host postgres db_exporter 110.128.131.16/32 md5"
```
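3. On the monitoring server, as root, configure the exporter service to start at boot by creating the mogdb_exporter.service file as follows.
`Environment` holds the database IP address and its listening port (26000).
`web.listen-address` is the port used by the exporter (9187).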
```bash
vi /usr/lib/systemd/system/mogdb_exporter.service
```
```bash
[Unit]
Description=Prometheus MogDB Exporter Service
[Service]
# The prometheus OS user must exist; create it yourself if it does not
User=prometheus
Environment="DATA_SOURCE_NAME=postgresql://db_exporter:Admin@1234@35.10.3.34:26000/postgres?sslmode=disable"
ExecStart=/app/opengauss_exporter/opengauss_exporter --auto-discover-databases --exclude-databases="template0,template1" --web.listen-address=:9187 --config=/app/opengauss_exporter/queries.yaml
[Install]
WantedBy=multi-user.target
```
> Note: for a cluster, write two `Environment` entries.
4. Start the exporter service and check its status
```bash
systemctl daemon-reload
systemctl enable mogdb_exporter.service
systemctl start mogdb_exporter.service
systemctl status mogdb_exporter.service
```
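The unit above keeps the database password in the unit file, which is typically world-readable. A possible alternative, sketched here on the assumption that systemd's `EnvironmentFile=` directive is acceptable in your environment, is to write `DATA_SOURCE_NAME` to a file with restrictive permissions and reference that file from the unit. The helper name `write_dsn_env` is hypothetical, and the connection values are the example values from step 3.

```bash
#!/usr/bin/env bash
# write_dsn_env FILE USER PASSWORD HOST PORT DBNAME
# Write a DATA_SOURCE_NAME line to FILE; a newly created FILE is owner-only
# because of the umask set inside the subshell.
write_dsn_env() {
  local file=$1 user=$2 pass=$3 host=$4 port=$5 db=$6
  ( umask 077
    printf 'DATA_SOURCE_NAME=postgresql://%s:%s@%s:%s/%s?sslmode=disable\n' \
      "$user" "$pass" "$host" "$port" "$db" > "$file" )
}

# Demo with a scratch file; a real deployment would pick a fixed path of its own.
env_file=$(mktemp)
write_dsn_env "$env_file" db_exporter 'Admin@1234' 35.10.3.34 26000 postgres
cat "$env_file"
```

In the unit file, the `Environment=` line would then be replaced by an `EnvironmentFile=` line pointing at the generated file.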
#### Add database nodes to Prometheus
```
vi /app/prometheus/target/db/db_.yml
```
```yml
- targets:
    - ':9187'
  labels:
    job: mogdb_exporter
    instance:
```