diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/README.md b/plugins/tensorboard-plugins/tb_graph_ascend/README.md
index 737292598ecfdf0fb39d676ce3352fb4b246076e..cd578e3f4d7d597229854e1362afe05753600f03 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/README.md
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/README.md
@@ -8,62 +8,64 @@
### 1. 相关依赖
- `python >= 3.7 ,tensorboard >= 2.11.2,numpy <= 1.26.3`
+`python >= 3.7 ,tensorboard >= 2.11.2,numpy <= 1.26.3`
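### 2. Installation
#### 2.1 Install with pip (recommended)
- - The plugin is published on PyPI and can be installed directly in a Python environment with pip:
- ```
- pip install tb-graph-ascend
- ```
- - Alternatively, download the offline whl package from PyPI and transfer it to an environment without public network access. Open the [download page](https://pypi.org/project/tb-graph-ascend/#files), select the whl package, and install it with the following command (`{version}` is the actual version of the whl package):
- ```
- pip install tb-graph_ascend_{version}-py3-none-any.whl
- ```
+- The plugin is published on PyPI and can be installed directly in a Python environment with pip:
+ ```
+ pip install tb-graph-ascend
+ ```
+- Alternatively, download the offline whl package from PyPI and transfer it to an environment without public network access. Open the [download page](https://pypi.org/project/tb-graph-ascend/#files), select the whl package, and install it with the following command (`{version}` is the actual version of the whl package):
+ ```
+ pip install tb-graph_ascend_{version}-py3-none-any.whl
+ ```
#### 2.2 Install from source
1. Clone the repository and check out the master branch: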
- ```
- git clone https://gitee.com/ascend/mstt.git -b master
- ```
+ ```
+ git clone https://gitee.com/ascend/mstt.git -b master
+ ```
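2. Change into the `plugins/tensorboard-plugins/tb_graph_ascend` directory
3. Build the front-end code, using the command that matches your operating system
- ```
- cd fe
- // Install front-end dependencies
- npm install --force
- // Windows
- npm run buildWin
- // Other systems where the cp command is available, such as Linux or macOS
- npm run buildLinux
- ```
+ ```
+ cd fe
+ // Install front-end dependencies
+ npm install --force
+ // Windows
+ npm run buildWin
+ // Other systems where the cp command is available, such as Linux or macOS
+ npm run buildLinux
+ ```
- **Note**: this step requires a [Node.js](https://nodejs.org/zh-cn/download) environment
+ **Note**: this step requires a [Node.js](https://nodejs.org/zh-cn/download) environment
4. Go back to the parent directory and install directly: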
- ```
- cd ../
- python setup.py develop
- ```
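- - Or build a whl package and install it
- ```
- python setup.py bdist_wheel
- ```
- Take the whl package from the `plugins/tensorboard-plugins/tb_graph_ascend/dist` directory and install it with the following command (`{version}` is the actual version of the whl package)
- ```
- pip install tb-graph_ascend_{version}-py3-none-any.whl
- ```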
+ ```
+ cd ../
+ python setup.py develop
+ ```
+
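+- Or build a whl package and install it
+ ```
+ python setup.py bdist_wheel
+ ```
+ Take the whl package from the `plugins/tensorboard-plugins/tb_graph_ascend/dist` directory and install it with the following command (`{version}` is the actual version of the whl package)
+ ```
+ pip install tb-graph_ascend_{version}-py3-none-any.whl
+ ```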
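### 3. Input data
- Place the model structure files with the .vis suffix (the files themselves are in JSON format), collected by the graph-building feature of the [msprobe](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools/msprobe#10-%E5%88%86%E7%BA%A7%E5%8F%AF%E8%A7%86%E5%8C%96%E6%9E%84%E5%9B%BE%E6%AF%94%E5%AF%B9) tool, in a folder. The path of this folder is referred to below as `output_path`
- - E.g. \
+Place the model structure files with the .vis suffix (the files themselves are in JSON format), collected by the graph-building feature of the [msprobe](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools/msprobe#10-%E5%88%86%E7%BA%A7%E5%8F%AF%E8%A7%86%E5%8C%96%E6%9E%84%E5%9B%BE%E6%AF%94%E5%AF%B9) tool, in a folder. The path of this folder is referred to below as `output_path`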
+
+- E.g. \
`---output_path` \
`-----output.vis` \
`-----output2.vis`
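
A quick pre-flight check for the layout above can be sketched as follows. This is only an illustration and not part of the plugin: the helper name `find_vis_files` is hypothetical, and it checks nothing beyond what the text states, namely that each `.vis` file directly under `output_path` parses as JSON.

```python
import json
from pathlib import Path


def find_vis_files(output_path):
    """Return (valid, invalid) lists of .vis file names found in output_path.

    Hypothetical helper: a file counts as valid if it can be read and
    parsed as JSON, which is all the README promises about .vis files.
    """
    valid, invalid = [], []
    for path in sorted(Path(output_path).glob("*.vis")):
        try:
            with path.open(encoding="utf-8") as handle:
                json.load(handle)
            valid.append(path.name)
        except (OSError, ValueError):
            # json.JSONDecodeError and UnicodeDecodeError are both ValueError subclasses
            invalid.append(path.name)
    return valid, invalid
```

Running it before starting TensorBoard gives an early hint about which files the plugin is likely to reject.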
@@ -90,39 +92,47 @@
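Note: if the files in the directory specified by `--logdir` are large or numerous, wait a while and then refresh the browser to see the loaded result.
-3. Starting TensorBoard locally is recommended. If the web browser and TensorBoard run on different machines, TensorBoard has to be started for remote access; see [remote startup](#413-远程查看数据). Users must assess the **security risks** themselves.
+3. Starting TensorBoard locally is recommended. If the web browser and TensorBoard run on different machines, TensorBoard has to be started for remote access; see [remote startup](#413-远程查看数据). Users must assess the **security risks** themselves.
## III. Viewing in the browser
+
**Note: this tool does not support accessing the same TensorBoard service from multiple browser windows at the same time; doing so can prevent the page from displaying correctly.**
### 3.1 Main interface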
-

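### 3.2 Controls
-- **Double-click a node to expand it; single-click to select it.**
-- **A selected node has a blue border; in comparison mode, if it has a corresponding node, that node's border is light blue.**
-- **The W and S keys zoom in and out around the mouse position; A and D move left and right.**
-- **The mouse wheel scrolls up and down, and the page can be dragged with the mouse.**
-- **In comparison mode, right-clicking selects a node and can expand to and select the corresponding node on the other side.**
+- **Double-click a node to expand it; single-click to select it.**
+- **A selected node has a blue border; in comparison mode, if it has a corresponding node, that node's border is light blue.**
+- **The W and S keys zoom in and out around the mouse position; A and D move left and right.**
+- **The mouse wheel scrolls up and down, and the page can be dragged with the mouse.**
+- **In comparison mode, right-clicking selects a node and can expand to and select the corresponding node on the other side.**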

+
### 3.3 Name search
+

+
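### 3.4 Precision filtering / overflow filtering
+
Note: precision filtering and overflow filtering are not available in single-graph mode; the figure below shows the two-graph comparison mode.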

+
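### 3.5 Unmatched node filtering
+
See the matching rules: nodes that do not satisfy them are unmatched nodes and are shown in gray. This is useful for finding structural differences between two models.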

+
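### 3.6 Manual node matching
+
In the browser, use the mouse to select two gray nodes and match them. Real-data mode is not supported yet.
If "操作选中节点及其子节点" (apply to the selected node and its child nodes) is checked:
-Clicking Match matches the two nodes and their child nodes one by one by Module name; clicking Unmatch also clears the child nodes' match relations.
+Clicking Match matches the two nodes and their child nodes one by one by Module name; clicking Unmatch also clears the child nodes' match relations.
Otherwise:
Clicking Match matches only the two nodes; clicking Unmatch clears the match relation between them.
Note: after matching, click Save to persist the result to the source file.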
@@ -130,43 +140,52 @@

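### 3.7 Generating a match configuration file
+
The match relations of already-matched nodes can be saved to a configuration file, and a configuration file can be loaded to apply its matches.
-By default the file is saved in the current directory as `[current file name].vis.config`. Each time you switch files, the current directory is scanned for configuration files with the .vis.config suffix and the configuration file list is updated.
+By default the file is saved in the current directory as `[current file name].vis.config`. Each time you switch files, the current directory is scanned for configuration files with the .vis.config suffix and the configuration file list is updated.
Note: after matching, click Save to persist the result to the source file.

+### 3.8 User-defined precision metric configuration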
+
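## IV. Appendix
### 4.1 Security hardening recommendations
#### 4.1.1 Disclaimer
+
This tool is a plugin built on TensorBoard and must run on top of it; pay attention to TensorBoard's own security configuration and security risks.
-When the tool is opened, it runs a security check on the vis files under the logdir directory and on their parent directory. If a security risk is found, the tool shows the prompt below and asks whether to continue. If the user continues, files and directories that failed the check can be operated on, at the user's own risk. If the user does not continue, only files that passed the check can be operated on.
+When the tool is opened, it runs a security check on the vis files under the logdir directory and on their parent directory. If a security risk is found, the tool shows the prompt below and asks whether to continue. If the user continues, files and directories that failed the check can be operated on, at the user's own risk. If the user does not continue, only files that passed the check can be operated on.

-#### 4.1.2 TensorBoard version notes
+
+#### 4.1.2 TensorBoard version notes
+
Any TensorBoard version that meets the requirements in [Dependencies](#1-相关依赖) works with this plugin. To limit TensorBoard's own security risks, however, using the latest TensorBoard version is recommended.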
+
#### 4.1.3 远程查看数据
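If the web browser and TensorBoard run on different machines, TensorBoard can be started in a way that allows remote viewing. This exposes the corresponding server port on the local network (listening on the all-zeros address), so pay attention to the security risks.
- * Append `--bind_all` or `--host={server IP}` to the startup command to enable remote viewing, for example:
+- Append `--bind_all` or `--host={server IP}` to the startup command to enable remote viewing, for example:
- ```
- tensorboard --logdir output_path --port=6006 --host=xxx.xxx.xxx.xxx
- or
- tensorboard --logdir output_path --port=6006 --bind_all
- ```
+ ```
+ tensorboard --logdir output_path --port=6006 --host=xxx.xxx.xxx.xxx
+ or
+ tensorboard --logdir output_path --port=6006 --bind_all
+ ```
- * When opening the page in the browser, replace the host name `localhost` in the URL with the host's IP address, for example `http://xxx.xxx.xxx.xxx:6006`
+- When opening the page in the browser, replace the host name `localhost` in the URL with the host's IP address, for example `http://xxx.xxx.xxx.xxx:6006`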
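### 4.2 Communication matrix
-| No. | Repository | Function | Source device | Source IP | Source port | Destination device | Destination IP | Destination port (listening) | Protocol | Port description | Port configuration | Listening port changeable | Plane | Version | Special scenarios | Remarks |
-|:----|:---|:--|:--|:---|:---|:---|:----|:--|:--|:---|:---|:---|:---|:-----|:-----|:---|
-| 1 | tensorboard-plugins | TensorBoard front-end/back-end communication | Machine running the browser that accesses TensorBoard | IP of the machine running the browser that accesses TensorBoard | | Machine running the TensorBoard service | IP of the server running the TensorBoard service | 6006 | HTTP | TensorBoard service communication | `--port` | Yes | Service plane | All versions | None | |
+
+| No. | Repository | Function | Source device | Source IP | Source port | Destination device | Destination IP | Destination port (listening) | Protocol | Port description | Port configuration | Listening port changeable | Plane | Version | Special scenarios | Remarks |
+| :--- | :------------------ | :------------------------- | :------------------------------ | :--------------------------------- | :----- | :----------------------- | :------------------------------ | :-------------------- | :--- | :------------------- | :------- | :----------------- | :------- | :------- | :------- | :--- |
+| 1 | tensorboard-plugins | TensorBoard front-end/back-end communication | Machine running the browser that accesses TensorBoard | IP of the machine running the browser that accesses TensorBoard | | Machine running the TensorBoard service | IP of the server running the TensorBoard service | 6006 | HTTP | TensorBoard service communication | `--port` | Yes | Service plane | All versions | None | |
+
### 4.3 Public network addresses
-[Public network addresses](./doc/公网地址说明.csv)
+[Public network addresses](./doc/公网地址说明.csv)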
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_update_precision.png b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_update_precision.png
new file mode 100644
index 0000000000000000000000000000000000000000..b764fc983c0178e6f2f1d77807a6a4635a7dbd9e
Binary files /dev/null and b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_update_precision.png differ
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/package-lock.json b/plugins/tensorboard-plugins/tb_graph_ascend/fe/package-lock.json
index 743efb7a0103e2718ada03cf069d4ad3396bdb8d..00185f250091f7a5d19fc126fb1441716015e6fd 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/fe/package-lock.json
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/package-lock.json
@@ -19,6 +19,7 @@
"@polymer/polymer": "^3.5.1",
"@vaadin/button": "24.6.5",
"@vaadin/checkbox": "24.6.5",
+ "@vaadin/checkbox-group": "^24.6.5",
"@vaadin/combo-box": "24.6.5",
"@vaadin/confirm-dialog": "24.6.5",
"@vaadin/context-menu": "24.6.5",
@@ -993,6 +994,24 @@
"lit": "^3.0.0"
}
},
+ "node_modules/@vaadin/checkbox-group": {
+ "version": "24.6.5",
+ "resolved": "https://registry.npmmirror.com/@vaadin/checkbox-group/-/checkbox-group-24.6.5.tgz",
+ "integrity": "sha512-1K34LnXxINlMSrwAynLW46nyAGqz6kZW4ogZeKESXa+JogjOiHCaVy127xIKYmfJD2yR4ti31VPQKPNQXlZpxA==",
+ "license": "Apache-2.0",
+ "dependencies": {
+ "@open-wc/dedupe-mixin": "^1.3.0",
+ "@polymer/polymer": "^3.0.0",
+ "@vaadin/a11y-base": "~24.6.5",
+ "@vaadin/checkbox": "~24.6.5",
+ "@vaadin/component-base": "~24.6.5",
+ "@vaadin/field-base": "~24.6.5",
+ "@vaadin/vaadin-lumo-styles": "~24.6.5",
+ "@vaadin/vaadin-material-styles": "~24.6.5",
+ "@vaadin/vaadin-themable-mixin": "~24.6.5",
+ "lit": "^3.0.0"
+ }
+ },
"node_modules/@vaadin/combo-box": {
"version": "24.6.5",
"resolved": "https://registry.npmmirror.com/@vaadin/combo-box/-/combo-box-24.6.5.tgz",
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/package.json b/plugins/tensorboard-plugins/tb_graph_ascend/fe/package.json
index f3416a523419b3b6b9367eb5258e56aa7e317f9c..9469af8c44eacb144274ee17ca16c297da3ac43b 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/fe/package.json
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/package.json
@@ -38,6 +38,7 @@
"@polymer/polymer": "^3.5.1",
"@vaadin/button": "24.6.5",
"@vaadin/checkbox": "24.6.5",
+ "@vaadin/checkbox-group": "^24.6.5",
"@vaadin/combo-box": "24.6.5",
"@vaadin/confirm-dialog": "24.6.5",
"@vaadin/context-menu": "24.6.5",
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_controls_board/components/tf_color_select/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_controls_board/components/tf_color_select/index.ts
index 44c9f5a7860c67ce497aa646d92755f19be31a42..10bedce9b586c57cc87b73b87690f42abf9d8ed0 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_controls_board/components/tf_color_select/index.ts
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_controls_board/components/tf_color_select/index.ts
@@ -15,6 +15,7 @@
*/
import '@vaadin/combo-box';
+import '@vaadin/text-field';
import * as _ from 'lodash';
import { PolymerElement, html } from '@polymer/polymer';
import { Notification } from '@vaadin/notification';
@@ -25,7 +26,7 @@ import request from '../../../utils/request';
import { DarkModeMixin } from '../../../polymer/dark_mode_mixin';
import { LegacyElementMixin } from '../../../polymer/legacy_element_mixin';
import { PRECISION_DESC } from '../../../common/constant';
-
+import '../tf_filter_precision_error/index'
const UNMATCHED_NODE_NAME = '无匹配节点';
@customElement('tf-color-select')
class Legend extends LegacyElementMixin(DarkModeMixin(PolymerElement)) {
@@ -193,6 +194,7 @@ class Legend extends LegacyElementMixin(DarkModeMixin(PolymerElement)) {
>
+
@@ -336,11 +338,15 @@ class Legend extends LegacyElementMixin(DarkModeMixin(PolymerElement)) {
+
`;
@property({ type: Boolean })
_colorSetting: boolean = true; // 颜色设置按钮
+ @property({ type: Boolean })
+ filterDialogOpened: boolean = false;
+
@property({ type: Boolean })
isSingleGraph = false;
@@ -483,11 +489,41 @@ class Legend extends LegacyElementMixin(DarkModeMixin(PolymerElement)) {
}
}
}
+ // Request the backend API and refresh the filter data
+ updateFilterData = async () => {
+ if (_.isEmpty(this.selectColor)) {
+ return;
+ }
+ try {
+ const params = {
+ run: this.selection.run,
+ tag: this.selection.tag,
+ microStep: this.selection.microStep,
+ precision_index: this.selectColor.join(','),
+ };
+
+ const precisionmenu = await request({ url: 'screen', method: 'GET', params: params });
+ this.set('precisionmenu', precisionmenu);
+ this.set('selectedPrecisionNode', precisionmenu?.[0] || '');
+ }
+ catch (error) {
+ Notification.show(`获取精度菜单失败,请检查 toggleCheckbox 和 vis 文件中的数据。`, {
+ position: 'middle',
+ duration: 4000,
+ theme: 'error',
+ });
+ }
+ }
toggleVisibility(): void {
this.set('_colorSetting', !this._colorSetting);
}
+ _clickFilter(event): void {
+ event.stopPropagation();
+ this.set('filterDialogOpened', true);
+ }
+
_clickSetting(event): void {
event.stopPropagation();
this.set('_colors', true);
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_controls_board/components/tf_filter_precision_error/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_controls_board/components/tf_filter_precision_error/index.ts
new file mode 100644
index 0000000000000000000000000000000000000000..8ddf63321ad355d134eb83797e97cf7f142a17a0
--- /dev/null
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_controls_board/components/tf_filter_precision_error/index.ts
@@ -0,0 +1,104 @@
+import '@vaadin/checkbox';
+import '@vaadin/confirm-dialog'
+import '@vaadin/checkbox-group';
+import '@vaadin/text-field';
+import { Notification } from '@vaadin/notification';
+import { customElement, property, observe } from '@polymer/decorators';
+import { html, PolymerElement } from '@polymer/polymer';
+import request from '../../../utils/request';
+import { isEmpty } from 'lodash';
+
+@customElement('tf-filter-precision-error')
+class TfFilterPrecisionError extends PolymerElement {
+ static readonly template = html`
+
+
+
+
+
+
+
+
+ `
+
+ @property({ type: Boolean, notify: true })
+ filterDialogOpened: boolean = false;
+
+ @property({ type: Array })
+ filterValue: string[] = [];
+
+ @property({ type: Object })
+ selection: any;
+
+ @property({ type: Object })
+ updateFilterData: Function = () => { };
+
+ MAX_RELATIVE_ERR = "0";
+ MIN_RELATIVE_ERR = "1";
+ MEAN_RELATIVE_ERR = "2";
+ NORM_RELATIVE_ERR = "3";
+
+ @observe('selection')
+ _selectionChanged() {
+ this.set('filterValue', [this.MAX_RELATIVE_ERR, this.MIN_RELATIVE_ERR, this.MEAN_RELATIVE_ERR, this.NORM_RELATIVE_ERR]);
+ }
+ override ready(): void {
+ super.ready();
+ const filterDialog = this.shadowRoot?.querySelector('#filter-dialog') as HTMLElement;
+ filterDialog?.addEventListener('confirm', this.onFilterDialogConfirm);
+ this.set('filterValue', [this.MAX_RELATIVE_ERR, this.MIN_RELATIVE_ERR, this.MEAN_RELATIVE_ERR, this.NORM_RELATIVE_ERR]);
+ }
+ onFilterDialogConfirm = async (e: any) => {
+ if (isEmpty(this.filterValue)) {
+ Notification.show(`错误: 精度误差计算指标为空,请选择指标`, {
+ position: 'middle',
+ duration: 1800,
+ theme: 'error',
+ });
+ setTimeout(() => {
+ this.set('filterDialogOpened', true);
+ }, 1800)
+ return;
+ }
+ const data = {
+ metaData: this.selection,
+ filterValue: this.filterValue
+ };
+ const { success, error } = await request({ url: 'updatePrecisionError', method: 'POST', data });
+ if (success) {
+ const updateHierarchyData = new CustomEvent('updateHierarchyData', { bubbles: true, composed: true });
+ this.dispatchEvent(updateHierarchyData);
+ this.set('filterDialogOpened', false);
+ this.updateFilterData();
+ Notification.show(`操作成功:精度误差值已更新`, {
+ position: 'middle',
+ duration: 2000,
+ theme: 'success',
+ });
+ }
+ else {
+ Notification.show(`精度误差计算错误${error}`, {
+ position: 'middle',
+ duration: 1800,
+ theme: 'error',
+ });
+ setTimeout(() => {
+ this.set('filterDialogOpened', true);
+ }, 1800)
+ }
+ }
+
+}
\ No newline at end of file
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_controls_board/components/tf_manual_match/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_controls_board/components/tf_manual_match/index.ts
index d9211712a0455ab450194f7c86b51034de7d3c1a..97b4899f7f5febd263ae7d806de6fe32bbda7f6c 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_controls_board/components/tf_manual_match/index.ts
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_controls_board/components/tf_manual_match/index.ts
@@ -67,12 +67,14 @@ class Legend extends PolymerElement {
.match-checkbox {
font-size: 14px;
}
+
.vaadin-details-title {
font-size: 14px;
color: #333333;
font-weight: 600;
margin-bottom: 0;
}
+
.vaadin-details vaadin-details-summary {
font-size: 15px;
color: #333333;
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_info_board/components/tf_vaddin_text_table/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_info_board/components/tf_vaddin_text_table/index.ts
index 7e97498e6c4e143f8cc33b42c3a528445e1c82d9..e490f5e9b598d9f4fb12c6aa919da4cd3aada593 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_info_board/components/tf_vaddin_text_table/index.ts
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/graph_info_board/components/tf_vaddin_text_table/index.ts
@@ -70,7 +70,7 @@ class TfVaadinTable extends PolymerElement {
cursor: pointer;
position: relative;
right: 58px;
- bottom: 106px;
+ bottom: 180px;
}
.copy-button:hover {
background: #0056b3;
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/controllers/match_nodes_controller.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/controllers/match_nodes_controller.py
index a15005adbe008467cb8b18f5fbadae9fb958df3d..904247786c215a8232c71a4aa71d2cd980dbe3bd 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/controllers/match_nodes_controller.py
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/controllers/match_nodes_controller.py
@@ -30,6 +30,16 @@ class MatchNodesController:
return False
return True
+ @staticmethod
+ def get_opposite_node_name(node_name):
+ opposite_node_name = ''
+ # If node_name contains 'forward', the opposite name replaces 'forward' with 'backward'; otherwise 'backward' is replaced with 'forward'
+ if 'forward' in node_name:
+ opposite_node_name = node_name.replace('forward', 'backward')
+ else:
+ opposite_node_name = node_name.replace('backward', 'forward')
+ return opposite_node_name
+
@staticmethod
def process_task_add(graph_data, npu_node_name, bench_node_name, task):
if not MatchNodesController.is_same_node_type(graph_data, npu_node_name, bench_node_name):
@@ -39,29 +49,45 @@ class MatchNodesController:
}
result = {}
+ opposite_result = {}
+ opposite_npu_node_name = MatchNodesController.get_opposite_node_name(npu_node_name)
+ opposite_bench_node_name = MatchNodesController.get_opposite_node_name(bench_node_name)
if task == 'md5':
result = MatchNodesController.process_md5_task_add(graph_data, npu_node_name, bench_node_name)
+ opposite_result = MatchNodesController.process_md5_task_add(graph_data, opposite_npu_node_name, opposite_bench_node_name)
elif task == 'summary':
result = MatchNodesController.process_summary_task_add(graph_data, npu_node_name, bench_node_name)
+ opposite_result = MatchNodesController.process_summary_task_add(graph_data, opposite_npu_node_name, opposite_bench_node_name)
else:
result = {
'success': False,
'error': 'task类型错误'
}
+ result['success'] = result.get('success') or opposite_result.get('success')
+ if not result.get('success'):
+ result['error'] = f'当前节点:{result.get("error", "")}。对侧节点:{opposite_result.get("error")}'
return result
@staticmethod
def process_task_delete(graph_data, npu_node_name, bench_node_name, task):
result = {}
+ opposite_result = {}
+ opposite_npu_node_name = MatchNodesController.get_opposite_node_name(npu_node_name)
+ opposite_bench_node_name = MatchNodesController.get_opposite_node_name(bench_node_name)
if task == 'md5':
result = MatchNodesController.process_md5_task_delete(graph_data, npu_node_name, bench_node_name)
+ opposite_result = MatchNodesController.process_md5_task_delete(graph_data, opposite_npu_node_name, opposite_bench_node_name)
elif task == 'summary':
result = MatchNodesController.process_summary_task_delete(graph_data, npu_node_name, bench_node_name)
+ opposite_result = MatchNodesController.process_summary_task_delete(graph_data, opposite_npu_node_name, opposite_bench_node_name)
else:
result = {
'success': False,
'error': 'task类型错误'
}
+ result['success'] = result.get('success') or opposite_result.get('success')
+ if not result.get('success'):
+ result['error'] = f'当前节点:{result.get("error", "")}。对侧节点:{opposite_result.get("error")}'
return result
@staticmethod
@@ -215,8 +241,10 @@ class MatchNodesController:
@staticmethod
def process_md5_task_add(graph_data, npu_node_name, bench_node_name):
- npu_node_data = graph_data.get('NPU', {}).get('node', {}).get(npu_node_name, {})
- bench_node_data = graph_data.get('Bench', {}).get('node', {}).get(bench_node_name, {})
+ npu_node_data = graph_data.get('NPU', {}).get('node', {}).get(npu_node_name)
+ bench_node_data = graph_data.get('Bench', {}).get('node', {}).get(bench_node_name)
+ if not npu_node_data or not bench_node_data:
+ return {'success': False, 'error': '节点不存在'}
# 去除节点名称前缀
npu_input_data = GraphUtils.remove_prefix(npu_node_data.get('input_data', {}), npu_node_name + '.')
bench_input_data = GraphUtils.remove_prefix(bench_node_data.get('input_data', {}), bench_node_name + '.')
@@ -285,8 +313,13 @@ class MatchNodesController:
'success': False,
'error': "操作失败:节点未匹配,请先匹配节点",
}
- npu_node_data = graph_data.get('NPU', {}).get('node', {}).get(npu_node_name, {})
- bench_node_data = graph_data.get('Bench', {}).get('node', {}).get(bench_node_name, {})
+ npu_node_data = graph_data.get('NPU', {}).get('node', {}).get(npu_node_name)
+ bench_node_data = graph_data.get('Bench', {}).get('node', {}).get(bench_node_name)
+ if not npu_node_data or not bench_node_data:
+ return {
+ 'success': False,
+ 'error': "操作失败:节点不存在",
+ }
# 在原始数据上,删除匹配节点,和匹配节点信息
npu_node_data['matched_node_link'] = []
bench_node_data['matched_node_link'] = []
@@ -309,8 +342,13 @@ class MatchNodesController:
'success': False,
'error': "操作失败:节点未匹配,请先匹配节点",
}
- npu_node_data = graph_data.get('NPU', {}).get('node', {}).get(npu_node_name, {})
- bench_node_data = graph_data.get('Bench', {}).get('node', {}).get(bench_node_name, {})
+ npu_node_data = graph_data.get('NPU', {}).get('node', {}).get(npu_node_name)
+ bench_node_data = graph_data.get('Bench', {}).get('node', {}).get(bench_node_name)
+ if not npu_node_data or not bench_node_data:
+ return {
+ 'success': False,
+ 'error': "操作失败:节点不存在",
+ }
# 在原始数据上,删除匹配节点,和匹配节点信息
npu_node_data['matched_node_link'] = []
bench_node_data['matched_node_link'] = []
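
The forward/backward pairing that this file's changes introduce can be sketched in isolation as below. It is a simplified illustration under assumptions, not the controller itself: the node names are invented, and `merge_results` is a hypothetical helper that mirrors how `process_task_add` and `process_task_delete` combine the two result dictionaries.

```python
def get_opposite_node_name(node_name):
    """Swap 'forward' and 'backward' in a node name, as the patch does."""
    if 'forward' in node_name:
        return node_name.replace('forward', 'backward')
    return node_name.replace('backward', 'forward')


def merge_results(result, opposite_result):
    """Hypothetical helper: the operation succeeds if either side succeeded."""
    merged = dict(result)
    merged['success'] = bool(result.get('success') or opposite_result.get('success'))
    if not merged['success']:
        # Report both errors so the caller can see which side failed and why
        merged['error'] = (
            f'current node: {result.get("error", "")}; '
            f'opposite node: {opposite_result.get("error")}'
        )
    return merged
```

Two properties are worth keeping in mind when reviewing: `str.replace` rewrites every occurrence of the substring, and a name containing neither word is returned unchanged, so in that case the "opposite" call targets the same node again.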
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/service/json_graph_service.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/service/json_graph_service.py
index a900850323ec091db6346d593f70eae935260e2a..ba95135dbe58e575db5db96e2de6cd37efd3fb71 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/service/json_graph_service.py
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/service/json_graph_service.py
@@ -24,12 +24,14 @@ from ..utils.global_state import GraphState
from ..controllers.match_nodes_controller import MatchNodesController
from ..controllers.layout_hierarchy_controller import LayoutHierarchyController
from ..utils.global_state import NPU_PREFIX, BENCH_PREFIX, NPU, BENCH, SINGLE
+from ..utils.global_state import MAX_RELATIVE_ERR, MIN_RELATIVE_ERR, MEAN_RELATIVE_ERR, NORM_RELATIVE_ERR
from .base_graph_service import GraphServiceStrategy
logger = tb_logging.get_logger()
class JsonGraphService(GraphServiceStrategy):
+
def __init__(self, run_path, tag):
super().__init__(run_path, tag)
@@ -197,6 +199,51 @@ class JsonGraphService(GraphServiceStrategy):
node_type_name = '调试侧' if graph_type == NPU else '标杆侧'
return {'success': False, 'error': f'{node_type_name}节点展开或收起发生错误', 'data': None}
+ def update_precision_error(self, meta_data, filter_value):
+ try:
+ graph_data, error_message = GraphUtils.get_graph_data(meta_data)
+ if error_message:
+ return {'success': False, 'error': error_message}
+ npu_node_list = graph_data.get(NPU, {}).get('node', {})
+ for _, node_info in npu_node_list.items():
+ output_statistical_diff = node_info.get('output_data', None)
+ if not node_info.get('matched_node_link') or not output_statistical_diff:
+ continue
+ max_rel_error = -1
+ # Compute the new error value from the metrics selected in filter_value
+ for _, diff_values in output_statistical_diff.items():
+ filter_diff_rel = []
+ if MAX_RELATIVE_ERR in filter_value:
+ filter_diff_rel.append(diff_values.get('MaxRelativeErr'))
+ if MIN_RELATIVE_ERR in filter_value:
+ filter_diff_rel.append(diff_values.get('MinRelativeErr'))
+ if NORM_RELATIVE_ERR in filter_value:
+ filter_diff_rel.append(diff_values.get('NormRelativeErr'))
+ if MEAN_RELATIVE_ERR in filter_value:
+ filter_diff_rel.append(diff_values.get('MeanRelativeErr'))
+ # Filter out N/A
+ filter_diff_rel = [x for x in filter_diff_rel if x and x != 'N/A']
+ # If the output metrics contain Nan/inf/-inf, mark the node with the maximum value directly
+ if "Nan" in filter_diff_rel or "inf" in filter_diff_rel or "-inf" in filter_diff_rel:
+ max_rel_error = 1
+ break
+ filter_diff_rel = [GraphUtils.convert_to_float(x) for x in filter_diff_rel]
+ max_rel_error_for_key = max(filter_diff_rel) if filter_diff_rel else 0
+ max_rel_error = max(max_rel_error, max_rel_error_for_key)
+ if max_rel_error != -1:
+ node_info.setdefault('data', {})['precision_index'] = min(max_rel_error, 1)
+ return {'success': True, 'data': {}}
+ except Exception as e:
+ logger.error('更新精度误差失败:' + str(e))
+ return {'success': False, 'error': str(e)}
+
+ def update_hierarchy_data(self, graph_type):
+ if graph_type in (NPU, BENCH):
+ hierarchy = LayoutHierarchyController.update_hierarchy_data(graph_type)
+ return {'success': True, 'data': hierarchy}
+ else:
+ return {'success': False, 'error': '节点类型错误'}
+
def get_node_info(self, node_info, meta_data):
graph_data, error_message = GraphUtils.get_graph_data(meta_data)
if error_message:
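
The metric selection in `update_precision_error` can be sketched as a standalone function. This is a simplified sketch under assumptions: `precision_index`, `to_float`, and `METRIC_KEYS` are illustrative names, the sample data is invented, and the function returns `None` where the patch leaves the node untouched.

```python
# Metric ids as defined in global_state.py, mapped to the keys read from output_data
METRIC_KEYS = {
    "0": "MaxRelativeErr",
    "1": "MinRelativeErr",
    "2": "MeanRelativeErr",
    "3": "NormRelativeErr",
}


def to_float(value):
    """Minimal stand-in for GraphUtils.convert_to_float ('12.5%' -> 0.125)."""
    try:
        if isinstance(value, str):
            value = value.split(',')[0]
            if value.endswith('%'):
                return float(value.replace('%', '').strip()) / 100.0
        return float(value)
    except ValueError:
        return float('nan')


def precision_index(output_data, filter_value):
    """Return the capped worst selected relative error, or None if no outputs."""
    worst = -1
    for diff_values in output_data.values():
        picked = [diff_values.get(METRIC_KEYS[k]) for k in filter_value if k in METRIC_KEYS]
        # Drop missing values and 'N/A' placeholders
        picked = [x for x in picked if x and x != 'N/A']
        # Any Nan/inf/-inf marks the node with the maximum value immediately
        if any(x in ("Nan", "inf", "-inf") for x in picked):
            return 1
        floats = [to_float(x) for x in picked]
        worst = max(worst, max(floats) if floats else 0)
    return None if worst == -1 else min(worst, 1)
```

The result is capped at 1, so a relative error of 200% and one of 100% produce the same index.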
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/global_state.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/global_state.py
index 9ed8c4920e62f62df8241a9b1b24876ae3e098f6..02db7d15b63fc863a0ad3b251376344b481e034b 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/global_state.py
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/global_state.py
@@ -49,6 +49,12 @@ API = 1
MULTI_COLLECTION = 8
API_LIST = 9
+# Metrics used in the precision error calculation
+MAX_RELATIVE_ERR = "0"
+MIN_RELATIVE_ERR = "1"
+MEAN_RELATIVE_ERR = "2"
+NORM_RELATIVE_ERR = "3"
+
class GraphState:
# 模块级全局变量
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/graph_utils.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/graph_utils.py
index 569b55f99a4c153d81c651b9c6743a22648fb6ec..5e77cd88c58e206d742703f0d12102a905e23c2c 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/graph_utils.py
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/graph_utils.py
@@ -173,6 +173,12 @@ class GraphUtils:
@staticmethod
def convert_to_float(value):
try:
+ if isinstance(value, str):
+ # Handle values such as '0.0%, 由于Mean小于1e-06, 建议不参考此相对误差,请参考绝对误差' and '0.0%'
+ value = value.split(',')[0]
+ if value.endswith('%'):
+ value = value.replace('%', '').strip()
+ return float(value) / 100.0
return float(value)
except ValueError:
return float('nan')
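
The effect of the new string handling is easiest to see on a few inputs. The function below is a standalone copy of the patched logic for illustration only; the English remark in the third test value is an invented stand-in for the Chinese note quoted in the comment.

```python
import math


def convert_to_float(value):
    """Standalone copy of the patched conversion, for illustration."""
    try:
        if isinstance(value, str):
            # Keep only the part before the first comma, dropping any trailing remark
            value = value.split(',')[0]
            if value.endswith('%'):
                value = value.replace('%', '').strip()
                return float(value) / 100.0
        return float(value)
    except ValueError:
        return float('nan')


print(convert_to_float('12.5%'))                      # percentage string
print(convert_to_float('0.0%, see absolute error'))   # percentage with a remark
print(convert_to_float(3))                            # plain number
print(math.isnan(convert_to_float('abc')))            # unparseable text becomes NaN
```

Only `ValueError` is caught, so a `None` input would still raise `TypeError`; the caller in `update_precision_error` filters such values out before converting.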
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/views/graph_views.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/views/graph_views.py
index d590d3c533d3729da0dabc166e1d94657d9591e1..d3fe234c564309afa401c3df22f947d85b4cd767 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/views/graph_views.py
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/views/graph_views.py
@@ -98,6 +98,18 @@ class GraphView:
result = strategy.load_graph_all_node_list(meta_data)
response = http_util.Respond(request, result, "application/json")
return response
+
+ # Update the precision error of matched nodes
+ @staticmethod
+ @wrappers.Request.application
+ @check_file_type
+ def update_precision_error(request):
+ data = GraphUtils.safe_json_loads(request.get_data().decode('utf-8'))
+ meta_data = data.get('metaData')
+ filter_value = data.get("filterValue")
+ strategy = GraphView._get_strategy(meta_data)
+ result = strategy.update_precision_error(meta_data, filter_value)
+ return http_util.Respond(request, result, "application/json")
# 展开关闭节点
@staticmethod
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/plugin.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/plugin.py
index 40d1e66120968c46dbfc7faba58322119f4077bb..21c5df98856789dfd6808e8d737c817a8f849f23 100644
--- a/plugins/tensorboard-plugins/tb_graph_ascend/server/plugin.py
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/plugin.py
@@ -82,6 +82,7 @@ class GraphsPlugin(base_plugin.TBPlugin):
'/saveData': GraphView.save_data,
'/updateColors': GraphView.update_colors,
'/saveMatchedRelations': GraphView.save_matched_relations,
+ '/updatePrecisionError': GraphView.update_precision_error,
}
def is_active(self):
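
For reference, the body that the front end posts to the new `/updatePrecisionError` route can be sketched as below. The field names `metaData` and `filterValue` come from the view added in this patch; the builder function, its validation, and the sample `metaData` values are assumptions made for illustration.

```python
import json

# Metric ids accepted by the back end, as defined in global_state.py
ALLOWED_METRICS = {
    "0": "MaxRelativeErr",
    "1": "MinRelativeErr",
    "2": "MeanRelativeErr",
    "3": "NormRelativeErr",
}


def build_update_precision_request(meta_data, filter_value):
    """Hypothetical helper: build the JSON body for /updatePrecisionError."""
    if not filter_value:
        # The dialog also refuses to submit an empty metric selection
        raise ValueError("select at least one metric")
    unknown = [v for v in filter_value if v not in ALLOWED_METRICS]
    if unknown:
        raise ValueError(f"unknown metric ids: {unknown}")
    return json.dumps({"metaData": meta_data, "filterValue": list(filter_value)})


# Example with invented selection values
body = build_update_precision_request({"run": "demo_run", "tag": "demo_tag"}, ["0", "3"])
print(body)
```

Validating the ids on the client side is a design choice of this sketch; the patched view itself passes `filterValue` through to the service unchanged.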