# analysis_binlog **Repository Path**: mo-shan/analysis_binlog ## Basic Information - **Project Name**: analysis_binlog - **Description**: 分析 MySQL 的 binlog 日志的工具 - **Primary Language**: Shell - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 147 - **Forks**: 41 - **Created**: 2019-04-15 - **Last Updated**: 2025-06-07 ## Categories & Tags **Categories**: dbmanager **Tags**: MySQL, Bash ## README # analysis_binlog # 介绍 - 作为一个MySQL DBA,查看分析binlog是日常工作的一部分,不知道你是否遇到过这样的需求:一个时间段,各个表的dml统计情况。如果binlog文件很多呢?又或者负责的业务线比较多,有多个业务都有这种需求呢?其实需求很简单,但是操作起来有点头疼? - 本文就是针对这类需求做一个测试。如果你在工作中也有类似的烦恼,或者经常需要批量解析binlog,这篇文章或许对你有帮助。 离线分析binlog工具,现有功能: - 1、分析统计各个表的dml次数。 - 2、各个业务表的最后更改时间。 - 3、对应的 dml 总次数。 - 4、binlog的事务总数。 - 5、业务表的binlog to sql。 - 6、多个binlog文件可并行解析。 - 7、其他功能敬请期待。 > git 连接:https://gitee.com/mo-shan/analysis_binlog ## 工具使用介绍 ``` root /data/git/analysis_binlog/bin >> bash analysis_binlog -h analysis_binlog v_1.3, for linux. Usage: bash analysis_binlog [OPTION]... --type=value or -t=value The value=detail | simple For example: --type=detail,-t=detail,-t=simple,-t=simple, The "detail": The results displayed are more detailed, but also take more time. The "simple": The results shown are simple, but save time The default value is "simple". --mysqlbinlog-path or -mpath The path of 'mysqlbinlog' For example: --mysqlbinlog-path=/path/mysqbinlog,-mpath=/path/mysqlbinlog The default value is 'which mysqlbinlog'. --binlog-dir or -bdir Specify a directory for the binlog dir. For example: --binlog-dir=/mysql_binlog_dir,-bdir=/mysql_binlog_dir If the input is a relative path, it will be automatically modified to an absolute path. The default value is "Current path". --binlog-file or -bfile Specify a file for the binlog file, multiple files separated by ",". For example: --binlog-file=/path/mysql_binlog_file,-bfile=/path/mysql_binlog_file --b-file=/path/mysql_binlog_file1,/path/mysql_binlog_file1 If the input is a relative path, it will be automatically modified to an absolute path. If this parameter is used, the "--binlog-dir or -bdir" parameter will be invalid. --sort or -s Sort the results for "INSERT | UPDATE | DELETE | Total" The value=insert | update | delete | total The default value is "total". --threads or -w Decompress/compress the number of concurrent. For example:--threads=8 This parameter works only when there are multiple files. If you use this parameter, specify a valid integer, and the default value is "1". --start-datetime or -stime Start reading the binlog at first event having a datetime equal or posterior to the argument; The argument must be a date and time in the local time zone, in any format accepted by the MySQL server for DATETIME and TIMESTAMP types, for example: -stime="2019-04-28 11:25:56" (you should probably use quotes for your shell to set it properly).. --stop-datetime or -etime Stop reading the binlog at first event having a datetime equal or posterior to the argument; The argument must be a date and time in the local time zone, in any format accepted by the MySQL server for DATETIME and TIMESTAMP types, for example: -etime="2019-04-28 11:25:56" (you should probably use quotes for your shell to set it properly). Applies to the first binlog passed on the command line. --start-position or -spos Start reading the binlog at position N(Integer). Applies to the first binlog passed on the command line. For example: --start-position=154 or -spos=154 --stop-position or -epos Stop reading the binlog at position N(Integer). Applies to the last binlog passed on the command line. For example: --stop-position=154 or -epos=154 --database or -d List entries for just this database (local log only). For example: --database=db_name or -d=db_name --record-type or -rt The value=c | count | t | trans | transaction For example: --record-type=c or -rt=t The "c | count" : The statistic type is the number of times a "DML SQL" has occurred. The "t | trans | transaction": The statistic type is the number of times a "DML transaction" has occurred. The default value is "count". --binlog2sql or -sql Convert binlog file to sql. At this time, the "--type or -t, --sort or -s" option will be invalid. For example: --binlog2sql or -sql --save-way or -sw The value=table | file | all. How to save the analysis results and this option needs to be used with the a option. For example: --save-way=file or -sw=table, the default value is "file". file : Save the results in a file. table: Save the results in different files according to the table name. These file names are called "db.table". all : The effect is equivalent to file and table. --stop Stop all tasks and exit. --help or -h Display this help and exit. ``` ## 工具安装 - 1、克隆项目 ``` git clone https://gitee.com/mo-shan/analysis_binlog.git ``` 进入analysis_binlog的家目录 - 2、更改路径(第一次使用需要配置) (1)更改mysqlbinlog路径 ``` sed -i 's#^mysqlbinlog="/data/mysql/base/bin/mysqlbinlog"#mysqlbinlog=\"/mysqlbinlog_path\"#g' bin/analysis_binlog #将这里的mysqlbinlog_path改成mysqlbinlog工具的绝对路径,否则可能会因版本太低导致错误 ``` (2)更改analysis_binlog家目录路径 ``` sed -i 's#^work_dir=.*#work_dir=\"/analysis_binlog_path\"#g' bin/analysis_binlog #将这里的analysis_binlog_path改成analysis_binlog的家目录的绝对路径 ``` - 3、为analysis_binlog配置环境变量(选做) ``` chmod +x bin/analysis_binlog echo "export PATH=$(pwd)/bin:${PATH}" >> ${HOME}/.bashrc ``` ## 使用测试 ### 使用例子1-统计业务表的dml情况: - 1、根据需求执行 - -bfile: 指定binlog文件, 支持多个文件并行分析, 多个文件用逗号相隔, 需要并行分析时请结合-w参数使用 - -w : 指定并行数, 当需要分析多个binlog文件时该参数有效, 默认是1 - -t : 指定显示结果的格式/内容, 供选选项有"detail|simple". 当指定detail的时候结果较为详细, 会打印详细的分析过程, 消耗时间也不直观, simple只做了统计工作 - -s : 指定排序规则, 供选选项有"insert|update|delete". 默认会把统计结果做一个排序, 按照表的维度统计出insert update delete的次数, 并按照次数大小排序(默认insert) >注: 其他参数使用请参见帮助手册 bash analysis_binlog -h (1)配置了环境变量 ``` analysis_binlog -bfile=/data/mysql/binlog/3306/mysql-bin.000798,/data/mysql/binlog/3306/mysql-bin.000799 -w=2 -t=simple -s=update ``` (2)未配置环境变量 ``` bash bin/analysis_binlog -bfile=/data/mysql/binlog/3306/mysql-bin.000798,/data/mysql/binlog/3306/mysql-bin.000799 -w=2 -t=simple -s=update ``` - 2、结果查询 分析完毕会在analysis_binlog家目录下的res目录下保存一个[binlog_file_name.res]文件,使用文本工具打开即可, 建议使用cat, tail, more, 如下结果展示, 会按照表的维度做个统计, 然后按照update的次数排序, Last Time表示该表的最后一次操作 ``` root /data/git/analysis_binlog/res >> cat mysql-bin.000798.res Table Last Time Insert(s) Update(s) Delete(s) moshan.flush_ 190311 9:28:54 0 3475 0 ultrax.dis_common_syscache 190312 11:31:53 0 231 0 ultrax.dis_common_cron 190312 11:31:53 0 194 0 ultrax.dis_common_session 190312 10:38:56 6 170 5 ultrax.dis_forum_forum 190312 9:19:10 0 129 0 moshan.money 190311 9:28:37 29 80 0 ultrax.dis_common_onlinetime 190312 10:38:42 0 48 0 ultrax.dis_forum_thread 190312 10:38:56 4 47 0 ultrax.dis_common_member_count 190312 10:38:53 0 47 0 ultrax.dis_common_credit_rule_log 190312 10:38:53 0 38 0 ultrax.dis_forum_post 190312 9:24:30 4 34 0 ultrax.dis_common_member_status 190312 9:04:42 0 20 0 moshan.history_ 190308 9:28:25 0 10 0 ice_db.server_setting_tmp 190304 10:34:19 564 8 0 ultrax.dis_common_process 190312 11:31:53 201 7 201 ultrax.dis_common_setting 190312 9:04:42 0 7 0 moshan.tmp_table 190304 17:17:21 0 7 0 ultrax.dis_ucenter_failedlogins 190306 10:07:11 0 4 0 ultrax.dis_common_member_field_home 190311 14:54:47 0 4 0 ultrax.dis_forum_threadcalendar 190312 9:09:56 2 2 0 ultrax.dis_forum_attachment 190306 11:46:56 2 2 0 moshan.use_date 190304 17:12:22 0 1 0 ultrax.dis_forum_threadhot 190312 9:09:56 4 0 0 ultrax.dis_forum_threaddisablepos 190311 14:54:47 1 0 0 ultrax.dis_forum_statlog 190312 9:04:42 304 0 0 ultrax.dis_forum_sofa 190311 14:54:47 4 0 0 ultrax.dis_forum_post_tableid 190311 14:54:47 4 0 0 ultrax.dis_forum_newthread 190311 14:54:47 4 0 6 ultrax.dis_forum_attachment_unused 190306 11:46:56 2 0 2 ultrax.dis_forum_attachment_8 190306 11:46:56 1 0 0 ultrax.dis_forum_attachment_0 190306 11:46:29 1 0 0 ultrax.dis_common_statuser 190311 11:40:44 4 0 4 ultrax.dis_common_searchindex 190312 10:38:53 28 0 0 ultrax.dis_common_member_action_log 190311 14:54:47 4 0 4 test.ttt 190303 11:43:36 2 0 0 test.t_test 190308 16:52:35 4 0 0 test.t_message_list 190313 9:30:16 307544 0 0 test.t_message_content_lately 190313 9:30:16 307544 0 0 test.admin_user 190308 11:51:50 3 0 3 Trans(total) Insert(s) Update(s) Delete(s) 312619 616270 4565 225 root /data/git/analysis_binlog/res >> ``` ### 使用例子2-binlog to sql: - 1、根据需求执行 - --binlog2sql : 表示将binlog分析成sql。 - -sw : 表示将结果按照业务表的维度保存,如果是file则将所有分析结果都保存在一个文件。 - --start-datetime : 开始时间。具体使用请参照mysqlbinlog工具的--start-datetime参数的使用 - --stop-datetime : 结束时间。具体使用,请参照mysqlbinlog工具的--stop-datetime参数的使用 - --start-position : 开始的pos值。具体使用,请参照mysqlbinlog工具的--start-position参数的使用 - --stop-position : 结束的pos值。具体使用,请参照mysqlbinlog工具的--stop-position参数的使用 >注: 其他参数使用请参见帮助手册 bash analysis_binlog -h ``` root /data/git/analysis_binlog >> bash bin/analysis_binlog -bfile=/data/mysql/binlog/3306/mysql-bin.000808 --binlog2sql -sw=table --start-datetime="2019-04-21 9:27:10" --stop-datetime="2019-04-22 10:00:00" --start-postion=1510151 --stop-position=1512137 [2019-04-28 19:37:07] [INFO] [192.168.1.5] Analysing --> /data/mysql/binlog/3306/mysql-bin.000808 [2019-04-28 19:37:08] [INFO] [192.168.1.5] Analysis completed --> /data/mysql/binlog/3306/mysql-bin.000808 ``` - 2、结果查询 分析完毕会在analysis_binlog家目录下的res目录下保存一个[binlog_file_name_to_sql.res]文件,使用文本工具打开即可, 如果【--save-way=table】,则会在res目录下创建【table】目录,该目录下会出现binlog_file_db.table.res的文件,这些文件就是保存了这个该binlog分析出来的sql语句, 如下结果展示 ``` root /data/git/analysis_binlog >> ll res total 836 drwxrwxr-x 3 moshan moshan 4096 Apr 28 19:45 ./ drwxrwxr-x 7 moshan moshan 4096 Apr 28 19:18 ../ -rw-rw-r-- 1 moshan moshan 0 Apr 28 19:16 .keep -rw-r--r-- 1 root root 789967 Apr 28 19:46 mysql-bin.000808_to_sql.res drwxr-xr-x 2 root root 4096 Apr 28 19:46 table/ root /data/git/analysis_binlog >> root /data/git/analysis_binlog >> cat res/mysql-bin.000808_to_sql.res |more BEGIN /*time:190417 17:36:38*/ UPDATE ultrax.dis_common_member_status WHERE @1=2 @2='192.168.88.188' @3='172.18.55.193' @4=-13568 (51968) @5=1555468129 @6=1555468129 @7=1555480860 @8=0 @9=0 @10=0 @11=0 @12=0 @13=0 @14=28 SET @1=2 @2='192.168.88.188' @3='172.18.55.193' @4=-8416 (57120) @5=1555493798 @6=1555468129 @7=1555480860 @8=0 @9=0 @10=0 @11=0 @12=0 @13=0 @14=28 COMMIT BEGIN /*time:190417 17:36:38*/ UPDATE ultrax.dis_common_member_status WHERE @1=2 @2='192.168.88.188' @3='172.18.55.193' @4=-8416 (57120) @5=1555493798 root /data/git/analysis_binlog >> ll res/table/ total 860 drwxr-xr-x 2 root root 4096 Apr 28 19:46 ./ drwxrwxr-x 3 moshan moshan 4096 Apr 28 19:45 ../ -rw-r--r-- 1 root root 287026 Apr 28 19:46 mysql-bin.000808_moshan.flush_.log -rw-r--r-- 1 root root 17594 Apr 28 19:46 mysql-bin.000808_moshan.info.log -rw-r--r-- 1 root root 326 Apr 28 19:46 mysql-bin.000808_moshan.t_1.log -rw-r--r-- 1 root root 7603 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_credit_rule_log.log -rw-r--r-- 1 root root 53099 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_cron.log -rw-r--r-- 1 root root 1667 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_member_action_log.log -rw-r--r-- 1 root root 15415 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_member_count.log -rw-r--r-- 1 root root 7079 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_member_field_home.log -rw-r--r-- 1 root root 12329 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_member.log -rw-r--r-- 1 root root 8336 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_member_status.log -rw-r--r-- 1 root root 4534 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_onlinetime.log -rw-r--r-- 1 root root 32934 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_process.log -rw-r--r-- 1 root root 10566 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_searchindex.log -rw-r--r-- 1 root root 40506 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_session.log -rw-r--r-- 1 root root 484 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_setting.log -rw-r--r-- 1 root root 3739 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_stat.log -rw-r--r-- 1 root root 496 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_statuser.log -rw-r--r-- 1 root root 28473 Apr 28 19:46 mysql-bin.000808_ultrax.dis_common_syscache.log -rw-r--r-- 1 root root 204213 Apr 28 19:46 mysql-bin.000808_ultrax.dis_forum_forum.log -rw-r--r-- 1 root root 819 Apr 28 19:46 mysql-bin.000808_ultrax.dis_forum_newthread.log -rw-r--r-- 1 root root 11655 Apr 28 19:46 mysql-bin.000808_ultrax.dis_forum_post.log -rw-r--r-- 1 root root 485 Apr 28 19:46 mysql-bin.000808_ultrax.dis_forum_post_tableid.log -rw-r--r-- 1 root root 495 Apr 28 19:46 mysql-bin.000808_ultrax.dis_forum_sofa.log -rw-r--r-- 1 root root 20746 Apr 28 19:46 mysql-bin.000808_ultrax.dis_forum_statlog.log -rw-r--r-- 1 root root 1225 Apr 28 19:46 mysql-bin.000808_ultrax.dis_forum_threadcalendar.log -rw-r--r-- 1 root root 1110 Apr 28 19:46 mysql-bin.000808_ultrax.dis_forum_threaddisablepos.log -rw-r--r-- 1 root root 497 Apr 28 19:46 mysql-bin.000808_ultrax.dis_forum_threadhot.log -rw-r--r-- 1 root root 32558 Apr 28 19:46 mysql-bin.000808_ultrax.dis_forum_thread.log root /data/git/analysis_binlog >> cat res/table/mysql-bin.000808_moshan.t_1.log BEGIN /*time:190418 17:34:08*/ INSERT INTO moshan.t_1 SET @1=1 COMMIT BEGIN /*time:190418 17:34:09*/ INSERT INTO moshan.t_1 SET @1=2 COMMIT BEGIN /*time:190418 18:05:43*/ INSERT INTO moshan.t_1 SET @1=3 @2='a' COMMIT BEGIN /*time:190418 18:06:29*/ INSERT INTO moshan.t_1 SET @1='b' @2=4 COMMIT root /data/git/analysis_binlog >> ``` > 提示:v_1.1版本引入新参数--record-type or -rt - 该参数表示以什么方式统计,可选的方式是两种,一是统计sql的个数,二统计事务的个数,默认是统计sql的个数。 如:某表有十行记录,现在执行delete from t; 如果binlog是row格式,这时候记录到binlog会是十个delete语句,但是是一个事务。这时候这个参数就起作用了。用户可以按照需求并参考使用手册使用该参数。但是需要注意的是如果一个事务里面存在多种dml,比如begin;insert into t select 1;update t2 set c=c+1;commit;对于这样的事务,会将该事务记到t2的update操作,不会记录到t的insert操作。